KIO provides the ability to import data to and export data from Kinetica; it comes pre-installed and ready to use. KIO can also transfer data between external sources directly, bypassing Kinetica if necessary. A simpler, GUI version of KIO is available in the KIO Tool section of GAdmin.
KIO can be accessed via the command line from /opt/gpudb/bin/kio.
Available Options:
Parameter | Description |
---|---|
-h, --help | Displays the help menu. |
-v, --version | Displays the version. |
-s <uri>, --source <uri> | The source URI. Use --show-usage to see more information about the URI types for supported data formats. |
-d <uri>, --destination <uri> | The destination URI. Use --show-usage to see more information about the URI types for supported data formats. |
--show-usage <format> | Displays detailed usage information for the given format. See the list of formats for more information about the format and format-specific options. |
--failure-mode <mode> | Changes the behavior of KIO upon encountering data parsing errors. There are two modes. |
--batch-size <num> | Sets the number of records to be fetched at a time from the given source. |
--engine <type> | Changes the underlying engine used to ingest records from a given source into Kinetica. Two engines are available: spark and odo. |
--spark-network-timeout <seconds> | Sets a network timeout in seconds. This is useful for regulating Spark executor heartbeat failures. Default is 800 seconds. |
--spark-driver-memory <memory> | Sets the Spark driver memory usage limit. Default is 2 GB. |
--spark-executor-memory <memory> | Sets the Spark executor memory usage limit. Default is 2 GB. |
--spark-offheap-memory <memory> | Sets the Spark off-heap memory usage limit. Default is 4 GB. |
--log-level <level> | Sets the log level for the KIO transaction. |
--log-file <file-path> | Sets the file to write logs to for the KIO transaction. If the file does not exist, it will be created; if it does exist, logs will be appended to it. By default, all logs are directed to standard output. |
--d-column-properties <list> | Enables overriding destination column properties for Kinetica-to-PostgreSQL transfers. The column properties list is passed in as a stringified JSON list of dictionaries, e.g., [{"column_name1": "<name>", "column_base_type": "<type>"}, {"column_name2": ...}, ...] |
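For instance, a hypothetical import that combines several of these options (the file path, table name, batch size, and log file below are placeholder values, not defaults) might look like:
./kio -s /home/jsmith/files/example_data.csv -d kinetica://::example_table --engine spark --batch-size 10000 --log-file /tmp/kio_transfer.log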
The following list details the available formats for each type.
Database formats: Kinetica, Microsoft SQL Server, Oracle, PostgreSQL, Sybase IQ, Teradata
File formats: Apache ORC, Apache Parquet, Apache Parquet Dataset, CSV, Shapefile (several of these can also be transferred via AWS S3, as noted in their sections)
The following sections detail which engines can support transferring data to (import) or from (export) Kinetica.
Important
Both the Spark and Odo engines can import from other Kinetica tables.
The Spark engine can transfer data from the following data format types to Kinetica:
The Odo engine can transfer data from the following data format types to Kinetica:
Important
Both the Spark and Odo engines can export to a Kinetica table.
The Spark engine can transfer data from Kinetica to the following data format types:
The Odo engine can transfer data from Kinetica to the following data format types:
Apache ORC
Apache Parquet (local)
Important
The source data cannot be transferred to a local Parquet file if the data contains bytes or timestamp data.
CSV (AWS S3, local)
Microsoft SQL Server
Oracle
PostgreSQL
Shapefile (AWS S3, local)
Because the supported formats can differ drastically from one another, KIO has to normalize the types of data being input so it can be output to the destination format appropriately. The table below describes the type inferred by KIO from each source format.
Source | Source Type | KIO Inferred Type |
---|---|---|
CSV | Numerical value without decimal places, e.g., 123456 | int64 |
CSV | Numerical value with decimal places, e.g., 123.456 | float64 |
CSV | Value containing string characters or symbols (A-Z, a-z, -, etc.) | string |
CSV | Time value, e.g., 23:45:15.848 | datetime |
CSV | Date (2008-04-15) or datetime (2009-04-16 21:00:15.325) value | string |
CSV | Empty or null (or null string) value | string |
Imports from or exports to a table in Kinetica.
Copying from one table to another is possible. For help with command line usage,
provide the --show-usage
parameter with the kinetica
option.
Important
If graceful
failure mode is enabled, KIO will attempt to sanitize
column names to fit within the Kinetica
column name restrictions that could prevent
proper ingest.
Base Kinetica URI:
kinetica://[[<username:password>]@][<host>:<port>]<::table name>
Note
The URI must be wrapped in quotes (''
or ""
) should a special
character be present that could confuse the command prompt
Important
If attempting to connect to Kinetica securely, the trusted root certificate
must be in the OpenSSL trust store. Also, the port portion of the URI
will need to reference the HTTPD proxy port/rank for Kinetica, e.g.,
localhost:8082/gpudb-0
(the default rank is gpudb-0
). See
Security Configuration for more details
on configuring Kinetica securely.
URI Parameters:
Parameter | Description |
---|---|
username | The username used for authentication. Default is blank |
password | The password used for authentication. Default is blank |
host | The host to connect to. Default is localhost |
port | The port to connect to. Default is 9191. Note: KIO supports connecting to Kinetica via HTTPD proxy (if enabled). |
table name | The table in the database to import into or export from. |
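As an illustration, a fully qualified URI for a hypothetical user jsmith loading a table named example_table on a local instance (all values below are placeholders) could look like:
kinetica://jsmith:passw0rd@localhost:9191::example_table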
Command Line Options:
Note
For more information on the types of engines, see Command Line Usage.
Options | Engine Availability | Description |
---|---|---|
--d-base-type-overrides '<type_definition>' | Both | Override default base types found in the source when the destination is Kinetica. Only applicable when importing data into a non-existing table. Values should be stringified JSON objects with double quotes, e.g., --d-base-type-overrides '{"column_name": "int"}'. Values should be a string for a given key, as there is a one-to-one mapping between a column and type. To override the base types for multiple columns, add another key/value pair, e.g., --d-base-type-overrides '{"column_name": "string", "column_name2": "float"}'. |
--d-property-additions '<prop_definition>' | Both | Properties added to the properties found in the source when the destination is Kinetica. Only applicable when importing data into a non-existing table. Values should be stringified JSON objects with double quotes, e.g., --d-property-additions '{"column_name": ["data", "int16"]}'. Values should be a list of strings for a given key, as there can be multiple property additions for a column. To override the properties for multiple columns, add another key/value pair, e.g., --d-property-additions '{"column_name": ["char4"], "column_name2": ["data", "int16"]}'. All the properties listed under Column Properties are supported when using the odo engine; a more limited set of properties is supported when using the spark engine. |
--d-property-overrides '<prop_definition>' | Both | Override default properties found in the source when the destination is Kinetica. Only applicable when importing data into a non-existing table. Values should be stringified JSON objects with double quotes, e.g., --d-property-overrides '{"column_name": ["data", "int16"]}'. Values should be a list of strings for a given key, as there can be multiple property overrides for a column. To override the properties for multiple columns, add another key/value pair, e.g., --d-property-overrides '{"column_name": ["char4"], "column_name2": ["data", "int16"]}'. Note: The Kinetica Spark Connector assumes all columns are nullable by default, so using the Spark engine will automatically apply the nullable property. |
--d-dry-run | odo | Generates a stringified JSON schema of the specified Kinetica destination table based on the input source. The schema includes the column's name, the column's type, any properties, and whether the column is nullable. Both a source and destination must be specified, but note that no data is transferred/appended and non-existing tables will not be created when this flag is present. The returned schema is a list of Python dictionaries, e.g., [{"is_nullable": false, "column_properties": ["data", "int16"], "column_type": "int", "column_name": "a"}] |
--s-https | spark | Generates a connection using HTTPS instead of the default HTTP to the specified Kinetica source table. Note: An HTTPS connection requires that the appropriate truststore and password be passed as well; see the --s-ssl-httpd-truststore and --s-ssl-httpd-truststore-password options. |
--d-https | spark | Generates a connection using HTTPS instead of the default HTTP to the specified Kinetica destination table. Note: An HTTPS connection requires that the appropriate truststore and password be passed as well; see the --d-ssl-httpd-truststore and --d-ssl-httpd-truststore-password options. |
--d-collection-name | Both | Specifies the collection in which the specified Kinetica destination table should be placed. If the collection does not exist, it will be created. If no collection is specified, the destination table will be placed in the root collection. If the destination table already exists, this option will NOT place the existing table into the specified collection or move the table from its existing collection to the specified collection. |
--d-is-replicated | spark | Specifies whether the destination table should be replicated. |
--d-update-on-existing-pk | spark | Specifies the record collision policy for inserting into a table with a primary key. If this flag is present, any existing table record with primary key values that match those of a record being inserted will be replaced by that new record. If this flag is not present, any existing table record with matching primary key values will remain unchanged and the new record will be discarded. If the specified table does not have a primary key, this flag is ignored. |
--d-jdbc-url '<jdbc-url>' | spark | Specifies a particular JDBC URL to connect to Kinetica. The default JDBC URL is 'jdbc:kinetica://<hostname>:9191', where <hostname> is the hostname referenced in the Kinetica URI. |
--d-spark-timeout-milliseconds | spark | Sets a connection timeout in milliseconds. The default value is 60000 (1 minute). |
--d-spark-enable-multihead | spark | Enables multi-head ingest. Important: Replicated tables are not supported with multi-head ingest. |
--s-ssl-httpd-truststore /file/path/truststore.jks | spark | When Kinetica is the source, the filepath to the Java truststore that is then passed to the HTTPD service to verify the SSL certificate that the Kinetica HTTPD server provides. Note: This option must be passed in conjunction with the --s-ssl-httpd-truststore-password option. |
--d-ssl-httpd-truststore /file/path/truststore.jks | spark | When Kinetica is the destination, the filepath to the Java truststore that is then passed to the HTTPD service to verify the SSL certificate that the Kinetica HTTPD server provides. Note: This option must be passed in conjunction with the --d-ssl-httpd-truststore-password option. |
--s-ssl-httpd-truststore-password '<password>' | spark | When Kinetica is the source, the password for the Java truststore passed with the --s-ssl-httpd-truststore parameter. |
--d-ssl-httpd-truststore-password '<password>' | spark | When Kinetica is the destination, the password for the Java truststore passed with the --d-ssl-httpd-truststore parameter. |
If a table does not exist when importing into Kinetica, the table will be created automatically and:
the table will not be replicated (unless specifying the --d-is-replicated option with spark as the --engine type)
the table will be randomly sharded by default
To remove properties, specify the --d-property-overrides option and list all properties that should be kept while leaving out any property(ies) that should be removed.
Overriding a base type will remove its existing properties.
If a table is being imported into Kinetica using the Spark engine and its name contains non-alphanumeric special characters, the special characters will be replaced with underscores.
If importing a GIS-type column(s) with the --gis-ext-enabled flag, the --d-property-additions parameter must be present containing a map of each GIS-type column to the wkt type, e.g.:
--d-property-additions '{"gis_col1": ["wkt"], "gis_col2": ["wkt"]}'
Imports from a table in Microsoft SQL Server. For help with
command line usage, provide the --show-usage
parameter with the
sqlserver
option.
Important
Microsoft SQL Server / GIS SQL Server is only supported when importing from Microsoft SQL Server to Kinetica with the Spark engine.
Base Microsoft SQL Server URI:
sqlserver://<username>:<password>@<address><:port>/<database_name><::table_name>
Important
All fields are required. The URI must be wrapped in quotes (''
or ""
)
should a special character be present that could confuse the command prompt
URI Parameters:
Parameter | Description |
---|---|
username | The username used for authentication. |
password | The password used for authentication. |
address | The IPV4 address to connect to. |
port | The port to connect to. |
database_name | The database name. |
table_name | The table in the database to import from. |
Command Line Options:
Option | Description |
---|---|
--gis-ext-enabled | Allows KIO to recognize a GIS-specific extension is enabled so SQL-native GIS types can be read into memory as wkt strings for proper conversion into Kinetica. This option must be passed in conjunction with the kinetica destination format option --d-property-additions to map the incoming GIS-type column(s) to the wkt type in Kinetica (see Examples for more detail). |
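For example, a hypothetical GIS-enabled import from SQL Server (the connection details, table, and geometry column name are placeholders) could be issued as:
./kio -s 'sqlserver://jsmith:temp1234@172.123.45.67:1433/gpudb::city_parcels' -d kinetica://::city_parcels --gis-ext-enabled --d-property-additions '{"parcel_geom": ["wkt"]}' --engine spark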
Imports from a table in an Oracle database. For help with
command line usage, provide the --show-usage
parameter with the oracle
option.
Important
Oracle is only supported when importing from Oracle to Kinetica with the Spark engine.
Base Oracle URI:
oracle://<username>:<password>@<address><:port>/<system_id><::schema_name>
Important
All fields are required. The URI must be wrapped in quotes (''
or ""
)
should a special character be present that could confuse the command prompt
URI Parameters:
Parameter | Description |
---|---|
username | The username used for authentication. |
password | The password used for authentication. |
address | The IPV4 address to connect to. |
port | The port to connect to. |
system_id | The system ID to identify a particular Oracle service. |
schema_name | The schema in the system to import from. |
KIO uses the Oracle XE Service to establish a database connection. The listener.ora and tnsnames.ora files must be correctly configured to expose the XE Service.
KIO only supports Oracle 12c.
When exporting from Oracle to a given destination, column names will be capitalized. When applying properties to a Kinetica destination table's column names, ensure the column names referenced are also capitalized, e.g.,:
--d-property-additions '{"VENDOR_ID": ["text_search"]}'
Imports from or exports to a table in PostgreSQL. Copying from one table to
another is possible. For help with command line usage, provide the
--show-usage
parameter with the postgresql
option.
Important
PostGIS is only supported when importing from PostgreSQL to Kinetica with the Spark engine.
Base PostgreSQL URI:
postgresql://<username>:<password>@<host><:port>/<database_name>[::schema_name]<::table_name>
Important
The URI must be wrapped in quotes (''
or ""
) should a special
character be present that could confuse the command prompt
URI Parameters:
Parameter | Description |
---|---|
username | The username used for authentication. Default is the user currently running KIO |
password | The password used for authentication. |
host | The host to connect to. Default is localhost |
port | The port to connect to. Default is 5432 |
database_name | The database name. Default is the value the username parameter is set to |
schema_name | The schema name. Optional |
table_name | The table in the database to import into or export from. |
Command Line Options:
Option | Description |
---|---|
--s-case-sensitive | When the source is PostgreSQL and this flag is passed, case sensitivity will be enabled for table, column, and schema names. |
--d-case-sensitive | When the destination is PostgreSQL and this flag is passed, case sensitivity will be enabled for table, column, and schema names. |
--s-postgis-schema <schema-name> | The PostGIS schema name for when the source is PostgreSQL. Default value is public. |
--d-postgis-schema <schema-name> | The PostGIS schema name for when the destination is PostgreSQL. Default value is public. |
--gis-ext-enabled | Allows KIO to recognize a GIS-specific extension is enabled so SQL-native GIS types can be read into memory as wkt strings for proper conversion into Kinetica. This option must be passed in conjunction with the kinetica destination format option --d-property-additions to map the incoming GIS-type column(s) to the wkt type in Kinetica (see Examples for more detail). |
--s-batch-iterator-column-name <column-name> | When the source is PostgreSQL, the name of a column that will be used to iterate over data when ingesting into Kinetica. This parameter is required for tables that have more than 10 million records, but tables of any size will receive better ingest performance. The provided column must be an integer column. Important: This option is only available for PostgreSQL to Kinetica transfers using the Spark engine. |
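For instance, a hypothetical large-table import from PostgreSQL that uses an integer column to iterate over the source data (the connection details and the trip_id column are placeholders) might be issued as:
./kio -s 'postgresql://jsmith:password123@172.123.45.67:5432/MASTER::nyctaxi' -d kinetica://::nyctaxi --engine spark --s-batch-iterator-column-name trip_id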
Imports from a table in a Sybase IQ database. For help with
command line usage, provide the --show-usage
parameter with the sybaseiq
option.
Important
Sybase IQ is only supported when importing from Sybase IQ to Kinetica with the Spark engine.
Base Sybase IQ URI:
sybaseiq://<username>:<password>@<address><:port>/<database_name><::table_name>
Important
All fields are required. The URI must be wrapped in quotes (''
or ""
)
should a special character be present that could confuse the command prompt
URI Parameters:
Parameter | Description |
---|---|
username | The username used for authentication. |
password | The password used for authentication. |
address | The host to connect to. |
port | The port to connect to. |
database_name | The database name. |
table_name | The table in the database to import from. |
Imports from a table in a Teradata database. For help with
command line usage, provide the --show-usage
parameter with the teradata
option.
Some considerations when connecting to Teradata:
1.3.2, 5, 6, 7, 8, or 9 are installed and reachable on your machine instance in order to connect to Teradata.
Important
Teradata is only supported when importing from Teradata to Kinetica with the Spark engine.
Base Teradata URI:
teradata://<username>:<password>@<address><:port>/<database_name><::table_name>
Important
All fields are required. The URI must be wrapped in quotes (''
or ""
)
should a special character be present that could confuse the command prompt
URI Parameters:
Parameter | Description |
---|---|
username | The username used for authentication. |
password | The password used for authentication. |
host | The host to connect to. |
port | The port to connect to. |
database_name | The database name. |
table_name | The table in the database to import from. The name must exactly match; predicate matching is not supported. |
(CHARSET) when connecting with Teradata.
DDSTATS.
ENCRYPTDATA.
NEW_PASSWORD.
Imports from an ORC file; exporting to ORC via KIO is currently not supported.
For help with command line usage, provide the --show-usage
parameter with
the orc
option.
URI Parameters:
Parameter | Description |
---|---|
/file/path/filename.orc | Absolute filepath to an ORC file |
Important
Do not specify single or double quotes around the file path, as KIO will interpret that as the name of the file, rather than a file path.
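For example, importing a hypothetical local ORC file into a new Kinetica table (both the file path and table name are placeholders) might look like:
./kio -s /tmp/data/example_data.orc -d kinetica://::example_data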
Imports from or exports to a Parquet file; however, exporting to Parquet files
in AWS S3 via KIO is currently not supported. Compressed or partitioned
Parquet files are also supported. For help with command line usage, provide the
--show-usage
parameter with the parquet
option.
Important
If graceful
failure mode is enabled, KIO will not validate the schema
of the dataset until time of ingest for a particular partition at which
point the partition will be skipped if its schema is invalid.
Tip
Apache Parquet files can also be transferred to/from an S3 bucket. See Transferring files via S3 for details.
URI Parameters:
Parameter | Description |
---|---|
/file/path/filename.parquet | Absolute filepath to a Parquet file |
Important
Do not specify single or double quotes around the file path, as KIO will interpret that as the name of the file, rather than a file path.
Command Line Options:
Options | Description |
---|---|
--s-recursive |
Recursively search a directory for parquet files instead of only searching the top level of the directory specified by the file path parameter. For example, given the directory below: /path/to/directory/
├── example1.parquet
├── example2.parquet
├── subdirectory/
│ ├── example4.parquet
│ └── example5.parquet
└── example3.parquet
Specifying |
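For example, importing a hypothetical local Parquet file into Kinetica (both the file path and table name are placeholders) might look like:
./kio -s /tmp/data/example_data.parquet -d kinetica://::example_data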
KIO only supports hidden Parquet file ingestion (/file/path/.<file-name>.parquet) using the Odo engine.
Imports from a Parquet Dataset; exporting to Parquet Datasets via KIO is
currently not supported. For help with command line usage, provide the
--show-usage
parameter with the parquetdataset
option.
Tip
Apache Parquet Dataset files can also be transferred to/from an S3 bucket. See Transferring files via S3 for details.
URI Parameters:
Parameter | Description |
---|---|
/file/path/directory.parquetdataset | Absolute path to a Parquet Dataset directory |
General:
/file/path/*.parquet can be used as the source URI instead to ingest the files as chunked Parquet files.
Use --d-property-additions, --d-property-overrides, and --d-base-type-overrides to establish the data's types as desired. Refer to Kinetica for more information.
AWS S3 Apache Parquet Dataset:
Imports from or exports to a CSV file. For help with command line usage, provide
the --show-usage
parameter with the csv
option.
Tip
CSV files can also be transferred to/from an S3 bucket. See Transferring files via S3 for details.
URI Parameters:
Parameter | Description |
---|---|
/file/path/filename.csv | Absolute or relative filepath to a CSV file |
Important
Do not specify single or double quotes around the file path, as KIO will interpret that as the name of the file, rather than a file path.
Command Line Options:
Options | Description |
---|---|
--d-delimiter <delimiter> | Set the delimiter used in the file when csv is the destination format. Any alphanumeric character or symbol is a valid delimiter; default is , . |
--s-delimiter <delimiter> | Set the delimiter used in the file when csv is the source format. Important: The option must match the delimiter in the file or the import will fail. |
--d-escape-character <character> | Set the character used to escape error-prone characters, e.g., , , "" , etc., when csv is the destination format. |
--s-escape-character <character> | Set the character used to escape error-prone characters when csv is the source format. Important: The option must match the escape character used in the file or the import will fail. |
--s-custom-null-value <value> | Set the null value literal when csv is the source format. Important: This option is only available when using the Spark engine. |
--s-custom-quote-value <value> | Set the quote symbol that will replace the default double quotes ( " ). Important: This option is only available when using the Spark engine. |
--s-custom-comment-value <value> | Set the comment symbol when csv is the source format. Important: This option is only available when using the Spark engine. |
--s-headerless | Specifies if the source CSV file is headerless, e.g., the first row in the CSV file is data instead of column headers. By default, Kinetica assumes all source CSV files have headers as the first row. |
--dynamically-infer-wkt-subtypes | Dynamically infer whether columns with base type string should be treated as wkt. Important: This option is only available for CSV to Kinetica transfers using the Odo engine. |
--s-drop-malformed-rows | Indicates rows of a source CSV file will not be inserted if either of the following is true. Important: This option is only available when using the Spark engine and when csv is the source format. |
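For example, a hypothetical import of a headerless, pipe-delimited CSV file (the file path and table name are placeholders) could combine the source-side options as follows:
./kio -s /home/jsmith/files/pipe_delimited_data.csv -d kinetica://::pipe_data --s-delimiter '|' --s-headerless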
If CSV is the destination and the target file(s) does not exist, column headers will be written to the file first and then data will be appended. If the target file(s) does exist, only data will be appended.
If importing from a CSV into a non-existing table in Kinetica, the table will be created. The database will infer column base types and null values depending on the data being imported.
KIO only supports wildcard ingestion of multiple CSVs (/file/path/*.csv
)
using the Spark engine
KIO only supports hidden CSV file ingestion
(/file/path/.<file-name>.csv
) using the Odo engine
If importing from a CSV into an existing table in Kinetica, the first row of the CSV file must contain a delimiter-separated (based on the user-specified delimiter) list of the column names in the order they appear in the existing table.
Warning
If the first row of the CSV file is a list of column names that don't match with the existing table, KIO will attempt to insert that row as data instead, which could result in import errors.
Imports from a shapefile or exports to another shapefile. For help with command
line usage, provide the --show-usage
parameter with the shapefile
option.
Tip
Shapefiles can also be transferred to/from an S3 bucket. See Transferring files via S3 for details.
URI Parameters:
Parameter | Description |
---|---|
/file/path/filename.shapefile | Absolute filepath to a shapefile |
Note
The URI must be wrapped in quotes (''
or ""
) should a special
character be present that could confuse the command prompt
A shapefile is referenced via a .shapefile file, which maps to a main file (.shp) and sometimes an index file (.shx) and a dBASE file (.dbf). KIO requires that at least the .shapefile and .shp files be available. These files (and any other files) must be in the same directory and have the same prefix to be recognized by KIO, e.g., usa-coastline.shapefile, usa-coastline.shp, etc.
KIO supports wildcard ingestion of multiple shapefiles (/file/path/*.shapefile) into the same destination data format provided that the schema for all files is identical, i.e., column names don't have to match but the rows of data therein must be of the same type. If a file's schema does not match the other files being ingested, the file will be ignored and the ingestion will continue. If the column names do not match, the first file ingested will determine the column names.
determine the column names.You can also import from or export to supported file formats in an S3 bucket.
For help with command line usage, provide the --show-usage
parameter with
the s3
option. AWS S3 in KIO isn't necessarily a separate format but
rather a way to access an S3 bucket for file transfers; all associated
parameters, options, and limitations apply when using a supported file format
within an S3 URI. The following file formats are supported with S3 KIO
operations:
Base S3 URI:
s3://<bucket name>/<file path>
Note
The URI must be wrapped in quotes (''
or ""
) should a special
character be present that could confuse the command prompt
URI Parameters:
Parameter | Description |
---|---|
<bucket name> | The unique name given to the S3 bucket in your AWS account. |
<file path> | The file path to the desired file(s) contained in the specified S3 bucket. |
Command Line Options:
Options | Description |
---|---|
--aws-access-key-id <key> | Sets the AWS access key ID for an AWS user with access permissions to both the source and destination (if both are AWS resources). |
--aws-secret-access-key <key> | Sets the AWS secret access key for an AWS user with access permissions to both the source and destination (if both are AWS resources). |
Important
If the S3 bucket has public access, the access key and secret access key are not required.
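For example, a hypothetical import of CSV files from a private S3 bucket that passes credentials on the command line (the bucket, table name, and key values are placeholders) might look like:
./kio -s 's3://my-s3-bucket/csv/*.csv' -d kinetica://::my_test_data --aws-access-key-id '<access-key-id>' --aws-secret-access-key '<secret-access-key>'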
If not using the access key ID and secret access key command line options, your machine can be configured for direct AWS S3 access. Once you've created your AWS access key ID and AWS secret access key:
Install the official AWS CLI on your machine using pip
:
pip install awscli
Add your AWS account credentials to the machine. You can find the default
region name in the URL for your AWS console; the default output format can
be left as the default (None
):
$ aws configure
AWS Access Key ID [None]: <access-key-id>
AWS Secret Access Key [None]: <secret-access-key>
Default region name [None]: <region-name>
Default output format [None]:
Verify the .aws
directory is available in your root directory:
$ ls -al ~/
total 500
dr-xr-x---. 10 root root 4096 Jun 23 12:25 .
dr-xr-xr-x. 17 root root 4096 Jun 23 11:49 ..
drwxr-xr-x 2 root root 39 Jun 23 12:25 .aws
...
Verify you can read the contents of your AWS S3 bucket:
aws s3 ls <your-bucket-name>
If the file does not exist when exporting to S3, it will be created; if the file does exist, it will be overwritten.
KIO supports wildcard ingestion of files on AWS S3 (e.g.,
s3://my-s3-bucket/*.csv
, s3://my-s3-bucket/*.shapefile
) into the
same destination data format provided that the schema for all files is
identical, i.e. column names don't have to match but the rows of data therein
must be of the same type. If a file's schema does not match the other files
being ingested, the file will be ignored and the ingestion will continue. If
the column names do not match, the first file ingested will determine the
column names
Important
If a prefix directory is provided, e.g.,
s3://my-s3-bucket/test/dir/*.csv
, it will be recursively searched,
and any .csv
files found will be copied to the destination.
The AWS credentials passed in via command line must have permissions to both the source and destination resources. KIO does not support separate AWS credentials for the source and destination resources.
Importing from a CSV file into the taxi_trip_data
table in Kinetica, adding a property, and overriding
one base type:
./kio -s /home/john_smith/files/taxi_trip_data.csv -d kinetica://::taxi_trip_data --d-property-additions '{"vendor_id": ["text_search"]}' --d-base-type-overrides '{"trip_distance": "double"}'
Exporting from the flights
table in Kinetica to a CSV file on the local
machine:
./kio -s kinetica://localhost:9191::flights -d /home/jane_smith/flights_data.csv
Exporting from PostgreSQL into Kinetica and overriding a property:
./kio -s postgresql://jsmith1:password123@172.123.45.67:5432/MASTER::nyctaxi -d 'kinetica://jsmith1:password234$@172.123.45.67:9191::taxi_data' --d-property-overrides '{"passenger_count": ["int16"]}'
Importing from CSV files in the specified S3 directory into Kinetica:
./kio -s 's3://my-s3-bucket/csv/*.csv' -d kinetica://::my_test_data
Exporting from the taxi_trip_data
table in Kinetica to a CSV file in an
S3 bucket:
./kio -s kinetica://::taxi_trip_data -d s3://my-s3-bucket/taxi_trip_data.csv
Importing from a shapefile into Kinetica:
./kio -s /tmp/data/usa-coastline.shapefile -d kinetica://::usa_coastline_data
Importing from taxi_trip_data
, a Parquet Dataset directory, in an AWS S3
bucket to a table in a collection in Kinetica:
./kio -s s3://data-bucket/taxi_trip_data.parquetdataset -d kinetica://::taxi_trip_data --d-collection-name my_collection
Importing from a local CSV file into Kinetica securely:
./kio -s /home/my-data.csv -d kinetica://jsmith:password234@172.123.45.67:8082/gpudb-0::my_data_table --d-https
Importing from a PostGIS-enabled PostgreSQL database into Kinetica:
./kio -s 'postgresql://postgres:password123@172.123.45.67:5432/postgres::roads' -d 'kinetica://::roads' --d-property-additions '{"roads_geom": ["wkt"]}' --gis-ext-enabled --engine spark
Importing from Microsoft SQL Server into Kinetica:
./kio -s sqlserver://jsmith:temp1234@172.123.45.67:1433/gpudb::nyctaxi -d kinetica://::nyctaxi
Importing from Oracle into Kinetica:
./kio -s oracle://jsmith:temp1234@172.123.45.67:1521/orcl::nyctaxi -d kinetica://::nyctaxi
Importing from Sybase IQ into Kinetica:
./kio -s sybaseiq://jsmith:sql123@172.123.45.67:2638/kindata::nyctaxi -d kinetica://::nyctaxi
Importing from Teradata into Kinetica:
./kio -s teradata://dbcadmin:password123@172.123.45.67:1025/gpudb::nyctaxi -d kinetica://::nyctaxi
KIO is distributed with each Kinetica database installation and will always be compatible with its local database instance.
Additionally, KIO v6.2 is forward-compatible with Kinetica v7.0 and can be used to transfer data between the two.
Note
KIO v7.0 is not backward-compatible with Kinetica versions prior to v7.0. If migrating data from a Kinetica v6.2 database to a Kinetica v7.0 database, the KIO on the Kinetica v6.2 host should be used to initiate the transfer.