Geospatial Partitioning
Geospatial partitioning is a method for organizing geospatial data so that spatial entities in close proximity to each other are stored in the same group. This grouping accelerates geospatial processing of the grouped data, and geospatial joins in particular benefit from it. The performance gain from this technique is proportional to the size of the data being joined.
Example
The goal of the geospatial join in this example is to show flights that landed at (or passed over) JFK airport. It does this by joining geolocation data, sampled at various points along the flight paths of different airlines, with neighborhood zone data from the NYC NTA data set, specifically the JFK airport zones. This example relies on the flights data set, which can be imported into Kinetica via GAdmin. That data will be copied to two tables, one without geopartitioning and one with geopartitioning, and then increased in size by an order of magnitude to show the performance difference between querying the two. The zone data, which includes JFK airport, will come from the NYC Neighborhood data file.
Setup
First, create the two flight data tables, with and without geopartitioning. The partitioned table is partitioned on the geohash of the flight path data points with a precision of 2, which partitions the data into 135 different groups.
Geo-Join
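As background for the join, the precision-2 geohash key used to partition the flight table can be illustrated outside the database. The following Python sketch is not Kinetica's implementation; it is a minimal encoder for the standard geohash scheme, and the function name `geohash`, the sample point labels, and the approximate airport coordinates are assumptions made for illustration.

```python
from collections import defaultdict

BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # standard geohash alphabet


def geohash(lon, lat, precision=2):
    """Encode a longitude/latitude pair as a geohash of the given length."""
    lon_range = [-180.0, 180.0]
    lat_range = [-90.0, 90.0]
    chars, bits, bit_count = [], 0, 0
    use_lon = True  # geohash bits alternate, starting with longitude
    while len(chars) < precision:
        rng, value = (lon_range, lon) if use_lon else (lat_range, lat)
        mid = (rng[0] + rng[1]) / 2
        if value >= mid:
            bits = (bits << 1) | 1  # upper half of the current range
            rng[0] = mid
        else:
            bits <<= 1              # lower half of the current range
            rng[1] = mid
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:          # every 5 bits yield one base-32 character
            chars.append(BASE32[bits])
            bits, bit_count = 0, 0
    return "".join(chars)


# Hypothetical sample points: (label, longitude, latitude), coordinates approximate
samples = [
    ("near_jfk", -73.7781, 40.6413),
    ("near_lga", -73.8740, 40.7769),
    ("london", -0.1276, 51.5072),
]

# Group the points by their precision-2 geohash, as a partition key would
groups = defaultdict(list)
for label, lon, lat in samples:
    groups[geohash(lon, lat, 2)].append(label)

jfk_key = geohash(-73.7781, 40.6413, 2)
print(jfk_key)       # dr
print(dict(groups))  # {'dr': ['near_jfk', 'near_lga'], 'gc': ['london']}
```

A precision-2 geohash has 32 × 32 = 1,024 possible cells, so the 135 groups mentioned in the setup are presumably the cells that actually contain flight data points.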
Now that the tables are ready, the same geospatial join can be run twice, once against the geospatially partitioned table and once against the non-partitioned table.
Without Geopartitioning
Non-partitioned query:
With Geopartitioning
Partitioned query:
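To illustrate why the partitioned join can do less work than the non-partitioned one, the following Python sketch compares a full scan with a scan restricted to the relevant partitions. It is a simplified model, not Kinetica's query plan: it uses a plain 32 × 32 grid cell in place of a geohash string (the same grid a precision-2 geohash divides the globe into), a rough bounding box in place of the JFK zone polygon, and made-up flight points. All names and values here are assumptions for illustration.

```python
from collections import defaultdict

GRID = 32  # 32 x 32 cells, matching the resolution of a precision-2 geohash


def cell_of(lon, lat):
    """Return the (column, row) grid cell containing a point."""
    col = min(int((lon + 180.0) / 360.0 * GRID), GRID - 1)
    row = min(int((lat + 90.0) / 180.0 * GRID), GRID - 1)
    return col, row


def in_box(lon, lat, box):
    """Test whether a point lies inside a (min_lon, min_lat, max_lon, max_lat) box."""
    min_lon, min_lat, max_lon, max_lat = box
    return min_lon <= lon <= max_lon and min_lat <= lat <= max_lat


def full_scan_join(points, box):
    """Non-partitioned approach: test every point against the zone."""
    return [p for p in points if in_box(p[1], p[2], box)]


def partition(points):
    """Group points by grid cell, standing in for the geopartitioned table."""
    parts = defaultdict(list)
    for p in points:
        parts[cell_of(p[1], p[2])].append(p)
    return parts


def partitioned_join(parts, box):
    """Partitioned approach: only test points in cells the zone overlaps."""
    min_lon, min_lat, max_lon, max_lat = box
    col0, row0 = cell_of(min_lon, min_lat)
    col1, row1 = cell_of(max_lon, max_lat)
    hits, scanned = [], 0
    for col in range(col0, col1 + 1):
        for row in range(row0, row1 + 1):
            for p in parts.get((col, row), []):
                scanned += 1
                if in_box(p[1], p[2], box):
                    hits.append(p)
    return hits, scanned


# Hypothetical flight samples: (track id, longitude, latitude)
points = [
    ("F1", -73.78, 40.64),   # inside the rough JFK box
    ("F2", -0.13, 51.51),    # London
    ("F3", -118.41, 33.94),  # Los Angeles
    ("F4", -73.90, 40.80),   # same grid cell as JFK, but outside the box
]
jfk_box = (-73.83, 40.62, -73.74, 40.67)  # rough bounding box, not the real zone

full_hits = full_scan_join(points, jfk_box)
part_hits, scanned = partitioned_join(partition(points), jfk_box)
print(full_hits == part_hits, scanned, len(points))
```

Both approaches return the same matches, but the partitioned version only examines the points stored in the cell that overlaps the zone, which is the saving that grows as the flight table grows.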