The following guide provides step-by-step instructions to get started writing Python applications using Kinetica. This guide demonstrates only a small set of the available API. A detailed description of the complete interface is available under Python API Reference.
Requirements
Note
The locations of python and pip should be included in the PATH environment variable. If they are not, you'll need to use the full path to the python and pip executables in the instructions below. Also, administrative access will most likely be required when installing the Python packages.
In the desired directory, run the following, replacing kinetica-version with the branch name matching the installed Kinetica version:
git clone -b [kinetica-version] --single-branch https://github.com/kineticadb/kinetica-api-python.git
Change directory into the newly downloaded repository, then install the Kinetica API and Avro packages:
pip install -e .
pip install avro
Test the installation:
cd example
python example.py
To interact with Kinetica, you must first instantiate an object of the GPUdb
class while providing the connection URL, including the host and port of the
database server.
import gpudb
""" Establish connection with a locally-running instance of Kinetica,
using binary encoding to save memory """
h_db = gpudb.GPUdb(encoding='BINARY', host='127.0.0.1', port='9191')
If the server is configured to authenticate users, the username and password will also need to be provided at the time of creating the object:
h_db = gpudb.GPUdb(encoding='BINARY', host='127.0.0.1', port='9191', username='user1', password='mysecret')
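Rather than hardcoding credentials in a script, they can be pulled from the environment at connect time. The sketch below assembles the connection arguments this way; the KINETICA_USER and KINETICA_PASS variable names are hypothetical, not part of the API:

```python
import os

# Hypothetical environment variable names; adjust to your deployment
username = os.environ.get("KINETICA_USER", "")
password = os.environ.get("KINETICA_PASS", "")

connect_args = {"encoding": "BINARY", "host": "127.0.0.1", "port": "9191"}
if username:
    # Only pass credentials when the server requires authentication
    connect_args.update(username=username, password=password)

# With a live Kinetica instance, the handle would be created as:
# h_db = gpudb.GPUdb(**connect_args)
```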
Before a table can be created and any data can be loaded into the database, a type needs to be defined. The type can be defined using one of two methods: a JSON string specifying the schema in raw format, or the more object-oriented GPUdbRecordType object form.
A JSON string describing the fields (i.e., columns) of the type, along with a name for the type, can be passed to the database handle. Each field consists of a name and a data type:
my_type='''{
"type":"record",
"name":"polygon_point",
"fields":[
{"name":"col1","type":"double"},
{"name":"col2","type":"string"}
]
}'''
response = h_db.create_type(type_definition = my_type, label = 'type_id_1')
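Since the type definition is just a JSON document, it can also be assembled from a Python dictionary with json.dumps, which guards against quoting mistakes in a hand-written string. A sketch, building the same schema as the string above:

```python
import json

# Assemble the same polygon_point schema as the hand-written string
type_def = json.dumps({
    "type": "record",
    "name": "polygon_point",
    "fields": [
        {"name": "col1", "type": "double"},
        {"name": "col2", "type": "string"},
    ],
})

# type_def can then be passed as the type_definition argument of create_type
parsed = json.loads(type_def)  # round-trips cleanly, proving valid JSON
```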
You can also pass in column sub-type properties using the following syntax:
response = h_db.create_type(type_definition = my_type, label = 'type_id_1', properties = {"column_name": ["column_property1", "column_property2"], "column_name2": ["column_property1"]})
Note
To read more about column types and properties, see Types.
The returned object from the create_type call contains a unique type identifier allocated by the system.
Alternatively, the type can be created using the GPUdbRecordType class and the GPUdbRecordColumn and GPUdbColumnProperty helper classes:
""" Create columns; column arguments consist of name then type, e.g.,
gpudb.GPUdbRecordColumn( "column_name", column_type, [column_property1,
column_property2] ) """
columns = []
columns.append(gpudb.GPUdbRecordColumn("city", gpudb.GPUdbRecordColumn._ColumnType.STRING, [gpudb.GPUdbColumnProperty.CHAR16]))
columns.append(gpudb.GPUdbRecordColumn("state_province", gpudb.GPUdbRecordColumn._ColumnType.STRING, [gpudb.GPUdbColumnProperty.CHAR32]))
columns.append(gpudb.GPUdbRecordColumn("country", gpudb.GPUdbRecordColumn._ColumnType.STRING, [gpudb.GPUdbColumnProperty.CHAR16]))
columns.append(gpudb.GPUdbRecordColumn("x", gpudb.GPUdbRecordColumn._ColumnType.DOUBLE))
columns.append(gpudb.GPUdbRecordColumn("y", gpudb.GPUdbRecordColumn._ColumnType.DOUBLE))
columns.append(gpudb.GPUdbRecordColumn("avg_temp", gpudb.GPUdbRecordColumn._ColumnType.DOUBLE))
columns.append(gpudb.GPUdbRecordColumn("time_zone", gpudb.GPUdbRecordColumn._ColumnType.STRING, [gpudb.GPUdbColumnProperty.CHAR8]))
# Create the type object
weather_record_type = gpudb.GPUdbRecordType(columns, label="weather_record_type")
""" Create the type in the database and save the type ID, needed to create
a table in the next step """
weather_record_type.create_type(h_db)
weather_type_id = weather_record_type.type_id
After you've created a type and stored the type ID in a variable, you can create a table.
weather_table = "weather"
response = h_db.create_table(table_name=weather_table, type_id=weather_type_id)
print "Table created: {}".format(response['status_info']['status'])
You can insert single records or multiple records using an encoded object list.
Note
The following methods are not the only way to insert records, but they are the simplest.
In this example, a dictionary object will be used to map key/value pairs to table column values.
# Create ordered dictionary for keys & values of record
datum = collections.OrderedDict()
datum["city"] = "Washington, D.C."
datum["state_province"] = "--"
datum["country"] = "USA"
datum["x"] = -77.016389
datum["y"] = 38.904722
datum["avg_temp"] = 58.5
datum["time_zone"] = "UTC-5"
# Encode record and put into a single element list
single_record = [gpudb.GPUdbRecord(weather_record_type, datum).binary_data]
# Insert the record into the table
response = h_db.insert_records(table_name=weather_table, data=single_record, list_encoding="binary", options={})
print "Number of single records inserted: {}".format(response["count_inserted"])
In this example, records will be defined in-line and added to a list.
# Create a list of encoded in-line records
encoded_obj_list = []
encoded_obj_list.append(gpudb.GPUdbRecord(weather_record_type, ["Paris", "TX", "USA", -95.547778, 33.6625, 64.6, "UTC-6"]).binary_data)
encoded_obj_list.append(gpudb.GPUdbRecord(weather_record_type, ["Memphis", "TN", "USA", -89.971111, 35.1175, 63, "UTC-6"]).binary_data)
encoded_obj_list.append(gpudb.GPUdbRecord(weather_record_type, ["Sydney", "Nova Scotia", "Canada", -60.19551, 46.13631, 44.5, "UTC-4"]).binary_data)
encoded_obj_list.append(gpudb.GPUdbRecord(weather_record_type, ["La Paz", "Baja California Sur", "Mexico", -110.310833, 24.142222, 77, "UTC-7"]).binary_data)
encoded_obj_list.append(gpudb.GPUdbRecord(weather_record_type, ["St. Petersburg", "FL", "USA", -82.64, 27.773056, 74.5, "UTC-5"]).binary_data)
encoded_obj_list.append(gpudb.GPUdbRecord(weather_record_type, ["Oslo", "--", "Norway", 10.75, 59.95, 45.5, "UTC+1"]).binary_data)
encoded_obj_list.append(gpudb.GPUdbRecord(weather_record_type, ["Paris", "--", "France", 2.3508, 48.8567, 56.5, "UTC+1"]).binary_data)
encoded_obj_list.append(gpudb.GPUdbRecord(weather_record_type, ["Memphis", "--", "Egypt", 31.250833, 29.844722, 73, "UTC+2"]).binary_data)
encoded_obj_list.append(gpudb.GPUdbRecord(weather_record_type, ["St. Petersburg", "--", "Russia", 30.3, 59.95, 43.5, "UTC+3"]).binary_data)
encoded_obj_list.append(gpudb.GPUdbRecord(weather_record_type, ["Lagos", "Lagos", "Nigeria", 3.384082, 6.455027, 83, "UTC+1"]).binary_data)
encoded_obj_list.append(gpudb.GPUdbRecord(weather_record_type, ["La Paz", "Pedro Domingo Murillo", "Bolivia", -68.15, -16.5, 44, "UTC-4"]).binary_data)
encoded_obj_list.append(gpudb.GPUdbRecord(weather_record_type, ["Sao Paulo", "Sao Paulo", "Brazil", -46.633333, -23.55, 69.5, "UTC-3"]).binary_data)
encoded_obj_list.append(gpudb.GPUdbRecord(weather_record_type, ["Santiago", "Santiago Province", "Chile", -70.666667, -33.45, 62, "UTC-4"]).binary_data)
encoded_obj_list.append(gpudb.GPUdbRecord(weather_record_type, ["Buenos Aires", "--", "Argentina", -58.381667, -34.603333, 65, "UTC-3"]).binary_data)
encoded_obj_list.append(gpudb.GPUdbRecord(weather_record_type, ["Manaus", "Amazonas", "Brazil", -60.016667, -3.1, 83.5, "UTC-4"]).binary_data)
encoded_obj_list.append(gpudb.GPUdbRecord(weather_record_type, ["Sydney", "New South Wales", "Australia", 151.209444, -33.865, 63.5, "UTC+10"]).binary_data)
encoded_obj_list.append(gpudb.GPUdbRecord(weather_record_type, ["Auckland", "--", "New Zealand", 174.74, -36.840556, 60.5, "UTC+12"]).binary_data)
encoded_obj_list.append(gpudb.GPUdbRecord(weather_record_type, ["Jakarta", "--", "Indonesia", 106.816667, -6.2, 83, "UTC+7"]).binary_data)
encoded_obj_list.append(gpudb.GPUdbRecord(weather_record_type, ["Hobart", "--", "Tasmania", 147.325, -42.880556, 56, "UTC+10"]).binary_data)
encoded_obj_list.append(gpudb.GPUdbRecord(weather_record_type, ["Perth", "Western Australia", "Australia", 115.858889, -31.952222, 68, "UTC+8"]).binary_data)
# Insert the records into the table
response = h_db.insert_records(table_name=weather_table, data=encoded_obj_list, list_encoding="binary", options={})
print "Number of batch records inserted: {}".format(response["count_inserted"])
Once the table is populated with data, the data can be retrieved as JSON from the system (by a call to get_records) as shown below:
""" Retrieve no more than 10 records as JSON from weather_table. Note that
records are stringified and have to be parsed. """
weatherLocs = h_db.get_records(table_name=weather_table, offset=0, limit=10, encoding="json", options={"sort_by":"city"})['records_json']
print "{:<20s} {:<25s} {:<15s} {:<10s} {:<11s} {:<9s} {:<8s}".format("City","State/Province","Country","Latitude","Longitude","Avg. Temp","Time Zone")
print "{:=<20s} {:=<25s} {:=<15s} {:=<10s} {:=<11s} {:=<9s} {:=<9s}".format("", "", "", "", "", "", "")
for weatherLoc in weatherLocs:
print "{city:<20s} {state_province:<25s} {country:<15s} {y:10.6f} {x:11.6f} {avg_temp:9.1f} {time_zone}".format(**json.loads(weatherLoc))
The data can also be retrieved using binary encoding, which is much faster than JSON encoding.
""" Retrieve no more than 25 of the remaining records as binary from weather
table. Note that records are binary and have to be decoded. """
response = h_db.get_records(table_name=weather_table, offset=10, limit=25, encoding="binary", options={"sort_by":"city"})
weatherLocs = gpudb.GPUdbRecord.decode_binary_data(response["type_schema"], response["records_binary"])
for weatherLoc in weatherLocs:
print "{city:<20s} {state_province:<25s} {country:<15s} {y:10.6f} {x:11.6f} {avg_temp:9.1f} {time_zone}".format(**weatherLoc)
""" Note that total_number_of_records does not reflect offset/limit; it's
the count of all records or those which match the given expression """
print "\nNumber of records in new table: {:d}".format(response["total_number_of_records"])
For large tables, the data can easily be retrieved in smaller blocks by using the offset and limit parameters, as shown above. The returned response also contains the schema (or data type) of the results.
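The paging pattern amounts to advancing offset by limit until fewer than limit records come back. A minimal sketch of that loop, using a plain list as a stand-in for the table and a hypothetical fetch_page helper in place of get_records:

```python
def fetch_page(records, offset, limit):
    """Stand-in for get_records: return one slice of the table."""
    return records[offset:offset + limit]

table = list(range(53))   # pretend table of 53 records
limit = 10
offset = 0
pages = []
while True:
    page = fetch_page(table, offset, limit)
    pages.append(page)
    if len(page) < limit:  # last (possibly partial) page reached
        break
    offset += limit

total = sum(len(p) for p in pages)
```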
Filters offer an easy way to reduce a larger table into more concise views.
""" Filter records where column x is less than 0, i.e., cities in the
western hemisphere, and store the filter in a view """
response = h_db.filter(table_name=weather_table, view_name=weather_w_view, expression="x < 0")
print "Number of records in the western hemisphere: {}".format(response["count"])
""" Filter records where column x is less than 0 and column y is greater
than 0, i.e., cities in the northwestern semi-hemisphere, and store
the filter in a view """
response = h_db.filter(table_name=weather_table, view_name=weather_nw_view, expression="x < 0 and y > 0")
print "Number of records in the northwestern semi-hemisphere: {}".format(response["count"])
""" Filter records using the same expressions as Example 2, but using
query chaining this time """
# Clear the view, as it will be recreated with chained query
h_db.clear_table(weather_nw_view)
response = h_db.filter(table_name=weather_w_view, view_name=weather_nw_view, expression="y > 0")
print "Number of records in the northwestern semi-hemisphere (with query chaining): {}".format(response["count"])
""" Filter by list where country name is USA, Brazil, or Australia """
country_map = {"country": ["USA", "Brazil", "Australia"]}
response = h_db.filter_by_list(table_name=weather_table, view_name=weather_country_view, column_values_map=country_map)
print "Number of records where country name is USA, Brazil, or Australia: {}".format(response["count"])
""" Filter by range cities that are east of GMT (the Prime Meridian) """
response = h_db.filter_by_range(table_name=weather_table, view_name=weather_e_view, column_name="x", lower_bound=0, upper_bound=180)
print "Number of records that are east of the Prime Meridian (x > 0): {}".format(response["count"])
Kinetica supports various aggregate and group-by queries, which group and aggregate your data to return counts and useful statistics.
""" Aggregate count, min, mean, and max on the average temperature """
response = h_db.aggregate_statistics(table_name=weather_table, column_name="avg_temp", stats="count,min,max,mean")
print "Statistics of values in the average temperature column:"
print "\tCount: {count:.0f}\n\tMin: {min:4.2f}\n\tMean: {mean:4.2f}\n\tMax: {max:4.2f}\n".format(**response["stats"])
""" Find unique city names """
response = h_db.aggregate_unique(table_name=weather_table, column_name="city", offset=0, limit=25, encoding="json")
print "Unique city names:"
weatherLocs = h_db.parse_dynamic_response(response)['response']['city']
for weatherLoc in weatherLocs:
print "\t* {}".format(weatherLoc)
""" Find number of weather locations per country in the northwestern
semi-hemisphere; use binary decoding """
response = h_db.aggregate_group_by(table_name=weather_nw_view, column_names=["country"], offset=0, limit=25, encoding="binary")
countries = gpudb.GPUdbRecord.decode_binary_data(response["response_schema_str"], response["binary_encoded_response"])
print "Weather locations per country in the northwest semi-hemisphere:"
for country in zip(countries["column_1"], countries["column_2"]):
print "\t{:<10s}{:2d}".format(country[0] + ":", country[1])
""" Filter table to southeastern semi-hemisphere records, group by country,
and aggregate min, max, and mean on the average temperature; use JSON
decoding """
h_db.filter(table_name=weather_table, view_name=weather_se_view, expression="x > 0 and y < 0")
response = h_db.aggregate_group_by(table_name=weather_se_view, column_names=["country", "min(avg_temp)", "max(avg_temp)", "mean(avg_temp)"], offset=0, limit=25, encoding="json")
data = h_db.parse_dynamic_response(response)['response']
print "{:<20s} {:^5s} {:^5s} {:^5s}".format("SE Semi-Hemi Country", "Min", "Mean", "Max")
print "{:=<20s} {:=<5s} {:=<5s} {:=<5s}".format("", "", "", "")
for countryWeather in zip(data["country"], data["min(avg_temp)"], data["mean(avg_temp)"], data["max(avg_temp)"]):
print "{:<20s} {:5.2f} {:5.2f} {:5.2f}".format(*countryWeather)
""" Filter for southern hemisphere cities and create a histogram for the
average temperature of those cities (divided into every 10 degrees,
e.g., 40s, 50s, 60s, etc.) """
h_db.filter(table_name=weather_table, view_name=weather_h_view, expression="y < 0")
histogram_result = h_db.aggregate_histogram(table_name=weather_h_view, column_name="avg_temp", start=40, end=90, interval=10)
print "Number of southern hemisphere cities with average temps in the given ranges:"
for histogroup in zip([40, 50, 60, 70, 80], histogram_result['counts']):
print "\t{}s: {:2.0f}".format(*histogroup)
You can delete records from a table using filter expressions. This method accepts multiple filter expressions; note that each expression is applied independently of the others (i.e., a record need only match any one expression's criteria to be deleted from the table).
""" Filter for cities that are either south of latitude -50 or west of
longitude -50 to determine how many records will be deleted; delete
the records, then confirm the deletion by refiltering """
deleteExpression = "x < -50 or y < -50"
delete_filter_result = h_db.filter(table_name=weather_table, expression=deleteExpression)
print "Number of records that meet deletion criteria before deleting: {}".format(delete_filter_result['count'])
h_db.delete_records(table_name = weather_table, expressions=[deleteExpression])
delete_filter_result = h_db.filter(table_name=weather_table, expression=deleteExpression)
print "Number of records that meet deletion criteria after deleting: {}".format(delete_filter_result['count'])
Included below is a complete sample program containing all the above queries:
"""This script walks through how to use the Python API.
Covered here: importing GPUdb, instantiating Kinetica, creating a type,
creating a table, inserting records, retrieving records, filtering records,
aggregating/grouping records, and deleting records.
"""
import collections
import json
import gpudb
def gpudb_example():
print "TUTORIAL OUTPUT"
print "===============\n"
# all tables/views used in examples below
weather_table = "weather"
weather_w_view = "weather_west"
weather_nw_view = "weather_northwest"
weather_country_view = "weather_country"
weather_e_view = "weather_east"
weather_se_view = "weather_southeast"
weather_h_view = "weather_histogram"
""" Establish connection with a locally-running instance of Kinetica,
using binary encoding to save memory """
h_db = gpudb.GPUdb(encoding='BINARY', host='127.0.0.1', port='9191')
print
print "CREATING A TYPE & TABLE"
print "-----------------------"
print
""" Create columns; column arguments consist of name then type, e.g.,
gpudb.GPUdbRecordColumn( "column_name", column_type, [column_property1,
column_property2] ) """
columns = []
columns.append(gpudb.GPUdbRecordColumn("city", gpudb.GPUdbRecordColumn._ColumnType.STRING, [gpudb.GPUdbColumnProperty.CHAR16]))
columns.append(gpudb.GPUdbRecordColumn("state_province", gpudb.GPUdbRecordColumn._ColumnType.STRING, [gpudb.GPUdbColumnProperty.CHAR32]))
columns.append(gpudb.GPUdbRecordColumn("country", gpudb.GPUdbRecordColumn._ColumnType.STRING, [gpudb.GPUdbColumnProperty.CHAR16]))
columns.append(gpudb.GPUdbRecordColumn("x", gpudb.GPUdbRecordColumn._ColumnType.DOUBLE))
columns.append(gpudb.GPUdbRecordColumn("y", gpudb.GPUdbRecordColumn._ColumnType.DOUBLE))
columns.append(gpudb.GPUdbRecordColumn("avg_temp", gpudb.GPUdbRecordColumn._ColumnType.DOUBLE))
columns.append(gpudb.GPUdbRecordColumn("time_zone", gpudb.GPUdbRecordColumn._ColumnType.STRING, [gpudb.GPUdbColumnProperty.CHAR8]))
# Create the type object
weather_record_type = gpudb.GPUdbRecordType(columns, label="weather_record_type")
""" Create the type in the database and save the type ID, needed to create
a table in the next step """
weather_record_type.create_type(h_db)
weather_type_id = weather_record_type.type_id
# Create the table from the type
if h_db.has_table(table_name=weather_table)['table_exists']:
h_db.clear_table(weather_table)
response = h_db.create_table(table_name=weather_table, type_id=weather_type_id)
print "Table created: {}".format(response['status_info']['status'])
print "\n"
print "INSERTING DATA"
print "--------------"
print
# Insert single record example
# Create ordered dictionary for keys & values of record
datum = collections.OrderedDict()
datum["city"] = "Washington, D.C."
datum["state_province"] = "--"
datum["country"] = "USA"
datum["x"] = -77.016389
datum["y"] = 38.904722
datum["avg_temp"] = 58.5
datum["time_zone"] = "UTC-5"
# Encode record and put into a single element list
single_record = [gpudb.GPUdbRecord(weather_record_type, datum).binary_data]
# Insert the record into the table
response = h_db.insert_records(table_name=weather_table, data=single_record, list_encoding="binary", options={})
print "Number of single records inserted: {}".format(response["count_inserted"])
# Insert multiple records example
# Create a list of encoded in-line records
encoded_obj_list = []
encoded_obj_list.append(gpudb.GPUdbRecord(weather_record_type, ["Paris", "TX", "USA", -95.547778, 33.6625, 64.6, "UTC-6"]).binary_data)
encoded_obj_list.append(gpudb.GPUdbRecord(weather_record_type, ["Memphis", "TN", "USA", -89.971111, 35.1175, 63, "UTC-6"]).binary_data)
encoded_obj_list.append(gpudb.GPUdbRecord(weather_record_type, ["Sydney", "Nova Scotia", "Canada", -60.19551, 46.13631, 44.5, "UTC-4"]).binary_data)
encoded_obj_list.append(gpudb.GPUdbRecord(weather_record_type, ["La Paz", "Baja California Sur", "Mexico", -110.310833, 24.142222, 77, "UTC-7"]).binary_data)
encoded_obj_list.append(gpudb.GPUdbRecord(weather_record_type, ["St. Petersburg", "FL", "USA", -82.64, 27.773056, 74.5, "UTC-5"]).binary_data)
encoded_obj_list.append(gpudb.GPUdbRecord(weather_record_type, ["Oslo", "--", "Norway", 10.75, 59.95, 45.5, "UTC+1"]).binary_data)
encoded_obj_list.append(gpudb.GPUdbRecord(weather_record_type, ["Paris", "--", "France", 2.3508, 48.8567, 56.5, "UTC+1"]).binary_data)
encoded_obj_list.append(gpudb.GPUdbRecord(weather_record_type, ["Memphis", "--", "Egypt", 31.250833, 29.844722, 73, "UTC+2"]).binary_data)
encoded_obj_list.append(gpudb.GPUdbRecord(weather_record_type, ["St. Petersburg", "--", "Russia", 30.3, 59.95, 43.5, "UTC+3"]).binary_data)
encoded_obj_list.append(gpudb.GPUdbRecord(weather_record_type, ["Lagos", "Lagos", "Nigeria", 3.384082, 6.455027, 83, "UTC+1"]).binary_data)
encoded_obj_list.append(gpudb.GPUdbRecord(weather_record_type, ["La Paz", "Pedro Domingo Murillo", "Bolivia", -68.15, -16.5, 44, "UTC-4"]).binary_data)
encoded_obj_list.append(gpudb.GPUdbRecord(weather_record_type, ["Sao Paulo", "Sao Paulo", "Brazil", -46.633333, -23.55, 69.5, "UTC-3"]).binary_data)
encoded_obj_list.append(gpudb.GPUdbRecord(weather_record_type, ["Santiago", "Santiago Province", "Chile", -70.666667, -33.45, 62, "UTC-4"]).binary_data)
encoded_obj_list.append(gpudb.GPUdbRecord(weather_record_type, ["Buenos Aires", "--", "Argentina", -58.381667, -34.603333, 65, "UTC-3"]).binary_data)
encoded_obj_list.append(gpudb.GPUdbRecord(weather_record_type, ["Manaus", "Amazonas", "Brazil", -60.016667, -3.1, 83.5, "UTC-4"]).binary_data)
encoded_obj_list.append(gpudb.GPUdbRecord(weather_record_type, ["Sydney", "New South Wales", "Australia", 151.209444, -33.865, 63.5, "UTC+10"]).binary_data)
encoded_obj_list.append(gpudb.GPUdbRecord(weather_record_type, ["Auckland", "--", "New Zealand", 174.74, -36.840556, 60.5, "UTC+12"]).binary_data)
encoded_obj_list.append(gpudb.GPUdbRecord(weather_record_type, ["Jakarta", "--", "Indonesia", 106.816667, -6.2, 83, "UTC+7"]).binary_data)
encoded_obj_list.append(gpudb.GPUdbRecord(weather_record_type, ["Hobart", "--", "Tasmania", 147.325, -42.880556, 56, "UTC+10"]).binary_data)
encoded_obj_list.append(gpudb.GPUdbRecord(weather_record_type, ["Perth", "Western Australia", "Australia", 115.858889, -31.952222, 68, "UTC+8"]).binary_data)
# Insert the records into the table
response = h_db.insert_records(table_name=weather_table, data=encoded_obj_list, list_encoding="binary", options={})
print "Number of batch records inserted: {}".format(response["count_inserted"])
print "\n"
print "RETRIEVING DATA"
print "---------------"
print
""" Retrieve no more than 10 records as JSON from weather_table. Note that
records are stringified and have to be parsed. """
weatherLocs = h_db.get_records(table_name=weather_table, offset=0, limit=10, encoding="json", options={"sort_by":"city"})['records_json']
print "{:<20s} {:<25s} {:<15s} {:<10s} {:<11s} {:<9s} {:<8s}".format("City","State/Province","Country","Latitude","Longitude","Avg. Temp","Time Zone")
print "{:=<20s} {:=<25s} {:=<15s} {:=<10s} {:=<11s} {:=<9s} {:=<9s}".format("", "", "", "", "", "", "")
for weatherLoc in weatherLocs:
print "{city:<20s} {state_province:<25s} {country:<15s} {y:10.6f} {x:11.6f} {avg_temp:9.1f} {time_zone}".format(**json.loads(weatherLoc))
""" Retrieve no more than 25 of the remaining records as binary from weather
table. Note that records are binary and have to be decoded. """
response = h_db.get_records(table_name=weather_table, offset=10, limit=25, encoding="binary", options={"sort_by":"city"})
weatherLocs = gpudb.GPUdbRecord.decode_binary_data(response["type_schema"], response["records_binary"])
for weatherLoc in weatherLocs:
print "{city:<20s} {state_province:<25s} {country:<15s} {y:10.6f} {x:11.6f} {avg_temp:9.1f} {time_zone}".format(**weatherLoc)
""" Note that total_number_of_records does not reflect offset/limit; it's
the count of all records or those which match the given expression """
print "\nNumber of records in new table: {:d}".format(response["total_number_of_records"])
print "\n"
print "FILTERING"
print "---------"
print
### Filter Example 1
""" Filter records where column x is less than 0, i.e., cities in the
western hemisphere, and store the filter in a view """
response = h_db.filter(table_name=weather_table, view_name=weather_w_view, expression="x < 0")
print "Number of records in the western hemisphere: {}".format(response["count"])
### Filter Example 2
""" Filter records where column x is less than 0 and column y is greater
than 0, i.e., cities in the northwestern semi-hemisphere, and store
the filter in a view """
response = h_db.filter(table_name=weather_table, view_name=weather_nw_view, expression="x < 0 and y > 0")
print "Number of records in the northwestern semi-hemisphere: {}".format(response["count"])
### Filter Example 3
""" Filter records using the same expressions as Example 2, but using
query chaining this time """
# Clear the view, as it will be recreated with chained query
h_db.clear_table(weather_nw_view)
response = h_db.filter(table_name=weather_w_view, view_name=weather_nw_view, expression="y > 0")
print "Number of records in the northwestern semi-hemisphere (with query chaining): {}".format(response["count"])
### Filter Example 4
""" Filter by list where country name is USA, Brazil, or Australia """
country_map = {"country": ["USA", "Brazil", "Australia"]}
response = h_db.filter_by_list(table_name=weather_table, view_name=weather_country_view, column_values_map=country_map)
print "Number of records where country name is USA, Brazil, or Australia: {}".format(response["count"])
### Filter Example 5
""" Filter by range cities that are east of GMT (the Prime Meridian) """
response = h_db.filter_by_range(table_name=weather_table, view_name=weather_e_view, column_name="x", lower_bound=0, upper_bound=180)
print "Number of records that are east of the Prime Meridian (x > 0): {}".format(response["count"])
print "\n"
print "AGGREGATING, GROUPING, and HISTOGRAMS"
print "-------------------------------------"
print
### Aggregate Example 1
""" Aggregate count, min, mean, and max on the average temperature """
response = h_db.aggregate_statistics(table_name=weather_table, column_name="avg_temp", stats="count,min,max,mean")
print "Statistics of values in the average temperature column:"
print "\tCount: {count:.0f}\n\tMin: {min:4.2f}\n\tMean: {mean:4.2f}\n\tMax: {max:4.2f}\n".format(**response["stats"])
### Aggregate Example 2
""" Find unique city names """
response = h_db.aggregate_unique(table_name=weather_table, column_name="city", offset=0, limit=25, encoding="json")
print "Unique city names:"
weatherLocs = h_db.parse_dynamic_response(response)['response']['city']
for weatherLoc in weatherLocs:
print "\t* {}".format(weatherLoc)
print
### Aggregate Example 3
""" Find number of weather locations per country in the northwestern
semi-hemisphere; use binary decoding """
response = h_db.aggregate_group_by(table_name=weather_nw_view, column_names=["country"], offset=0, limit=25, encoding="binary")
countries = gpudb.GPUdbRecord.decode_binary_data(response["response_schema_str"], response["binary_encoded_response"])
print "Weather locations per country in the northwest semi-hemisphere:"
for country in zip(countries["column_1"], countries["column_2"]):
print "\t{:<10s}{:2d}".format(country[0] + ":", country[1])
print
### Aggregate Example 4
""" Filter table to southeastern semi-hemisphere records, group by country,
and aggregate min, max, and mean on the average temperature; use JSON
decoding """
h_db.filter(table_name=weather_table, view_name=weather_se_view, expression="x > 0 and y < 0")
response = h_db.aggregate_group_by(table_name=weather_se_view, column_names=["country", "min(avg_temp)", "max(avg_temp)", "mean(avg_temp)"], offset=0, limit=25, encoding="json")
data = h_db.parse_dynamic_response(response)['response']
print "{:<20s} {:^5s} {:^5s} {:^5s}".format("SE Semi-Hemi Country", "Min", "Mean", "Max")
print "{:=<20s} {:=<5s} {:=<5s} {:=<5s}".format("", "", "", "")
for countryWeather in zip(data["country"], data["min(avg_temp)"], data["mean(avg_temp)"], data["max(avg_temp)"]):
print "{:<20s} {:5.2f} {:5.2f} {:5.2f}".format(*countryWeather)
print
### Aggregate Example 5
""" Filter for southern hemisphere cities and create a histogram for the
average temperature of those cities (divided into every 10 degrees,
e.g., 40s, 50s, 60s, etc.) """
h_db.filter(table_name=weather_table, view_name=weather_h_view, expression="y < 0")
histogram_result = h_db.aggregate_histogram(table_name=weather_h_view, column_name="avg_temp", start=40, end=90, interval=10)
print "Number of southern hemisphere cities with average temps in the given ranges:"
for histogroup in zip([40, 50, 60, 70, 80], histogram_result['counts']):
print "\t{}s: {:2.0f}".format(*histogroup)
print "\n"
print "DELETING DATA"
print "-------------"
print
""" Filter for cities that are either south of latitude -50 or west of
longitude -50 to determine how many records will be deleted; delete
the records, then confirm the deletion by refiltering """
deleteExpression = "x < -50 or y < -50"
delete_filter_result = h_db.filter(table_name=weather_table, expression=deleteExpression)
print "Number of records that meet deletion criteria before deleting: {}".format(delete_filter_result['count'])
h_db.delete_records(table_name = weather_table, expressions=[deleteExpression])
delete_filter_result = h_db.filter(table_name=weather_table, expression=deleteExpression)
print "Number of records that meet deletion criteria after deleting: {}".format(delete_filter_result['count'])
# end gpudb_example()
if __name__ == '__main__':
gpudb_example()
After running the above sample script, the following is output:
TUTORIAL OUTPUT
===============
CREATING A TYPE & TABLE
-----------------------
Table created: OK
INSERTING DATA
--------------
Number of single records inserted: 1
Number of batch records inserted: 20
RETRIEVING DATA
---------------
City State/Province Country Latitude Longitude Avg. Temp Time Zone
==================== ========================= =============== ========== =========== ========= =========
Auckland -- New Zealand -36.840556 174.740000 60.5 UTC+12
Buenos Aires -- Argentina -34.603333 -58.381667 65.0 UTC-3
Hobart -- Tasmania -42.880556 147.325000 56.0 UTC+10
Jakarta -- Indonesia -6.200000 106.816667 83.0 UTC+7
La Paz Baja California Sur Mexico 24.142222 -110.310833 77.0 UTC-7
La Paz Pedro Domingo Murillo Bolivia -16.500000 -68.150000 44.0 UTC-4
Lagos Lagos Nigeria 6.455027 3.384082 83.0 UTC+1
Manaus Amazonas Brazil -3.100000 -60.016667 83.5 UTC-4
Memphis TN USA 35.117500 -89.971111 63.0 UTC-6
Memphis -- Egypt 29.844722 31.250833 73.0 UTC+2
Oslo -- Norway 59.950000 10.750000 45.5 UTC+1
Paris TX USA 33.662500 -95.547778 64.6 UTC-6
Paris -- France 48.856700 2.350800 56.5 UTC+1
Perth Western Australia Australia -31.952222 115.858889 68.0 UTC+8
Santiago Santiago Province Chile -33.450000 -70.666667 62.0 UTC-4
Sao Paulo Sao Paulo Brazil -23.550000 -46.633333 69.5 UTC-3
St. Petersburg FL USA 27.773056 -82.640000 74.5 UTC-5
St. Petersburg -- Russia 59.950000 30.300000 43.5 UTC+3
Sydney Nova Scotia Canada 46.136310 -60.195510 44.5 UTC-4
Sydney New South Wales Australia -33.865000 151.209444 63.5 UTC+10
Washington, D.C. -- USA 38.904722 -77.016389 58.5 UTC-5
Number of records in new table: 21
FILTERING
---------
Number of records in the western hemisphere: 11
Number of records in the northwestern semi-hemisphere: 6
Number of records in the northwestern semi-hemisphere (with query chaining): 6
Number of records where country name is USA, Brazil, or Australia: 8
Number of records that are east of the Prime Meridian (x > 0): 10
AGGREGATING, GROUPING, and HISTOGRAMS
-------------------------------------
Statistics of values in the average temperature column:
Count: 21
Min: 43.50
Mean: 63.74
Max: 83.50
Unique city names:
* Auckland
* Buenos Aires
* Hobart
* Jakarta
* La Paz
* Lagos
* Manaus
* Memphis
* Oslo
* Paris
* Perth
* Santiago
* Sao Paulo
* St. Petersburg
* Sydney
* Washington, D.C.
Weather locations per country in the northwest semi-hemisphere:
Canada: 1
Mexico: 1
USA: 4
SE Semi-Hemi Country Min Mean Max
==================== ===== ===== =====
Tasmania 56.00 56.00 56.00
New Zealand 60.50 60.50 60.50
Australia 63.50 65.75 68.00
Indonesia 83.00 83.00 83.00
Number of southern hemisphere cities with average temps in the given ranges:
40s: 1
50s: 1
60s: 6
70s: 0
80s: 2
DELETING DATA
-------------
Number of records that meet deletion criteria before deleting: 10
Number of records that meet deletion criteria after deleting: 0