Load Data from Local File

Copy and paste tutorial for loading data from a local file on your computer

To load data from your local computer, you can:

  • Use the command line from your computer with Kinetica's KiSQL tool that allows you to upload the file to the Kinetica File System (KiFS) and initiate the load into the table with SQL.
  • Or, use a third party SQL tool like DBeaver with Kinetica's JDBC driver and follow the same instructions for KiSQL. Usage of a third party tool is outside the scope of this article. Refer to the JDBC driver page for more information.

KiSQL

Download the appropriate version of KiSQL for your instance of Kinetica.

Run KiSQL from the CLI and connect to Kinetica:

Connect to Kinetica with KiSQL
1
kisql --url http://localhost:9191;RowsPerFetch=20000 --user auser

Next, create a directory in KiFS using the CREATE DIRECTORY command:

Createa a Directory in KiFS
1
CREATE DIRECTORY 'my_directory'

Then, upload your file(s) to KiFS using the UPLOAD FILE command:

Upload a File to KiFS
1
UPLOAD FILES 'my_data.csv' INTO 'my_directory'

To initiate the ingest of data into a Kinetica table, use the LOAD INTO command for a one-time batch load. To reference files in KiFS, you can use a URL that looks like this:

kifs://<folder_name>/<filename>.<extension>
Load Data from File in KiFS
1
2
3
LOAD INTO ki_home.my_data
FROM FILE PATHS 'kifs://my_directory/my_data.csv'
WITH OPTIONS (PRIMARY KEY = ('id'))

Considerations

The following are some common options used when loading. For the full list of options, see LOAD INTO. For copy/paste examples of many of the options, see Loading Data.

Error Handling

Kinetica has two different error handling modes for dealing with erroneous data. To halt ingestion after a bad record is found, use the ABORT mode. To skip erroneous records and continue the ingest, use the SKIP mode.

To inspect erroneous records, you may use the BAD RECORD TABLE NAME option. All bad records encountered will be stored there for review. The bad records table is limited to 10,000 records by default and may be overridden using the BAD RECORD TABLE LIMIT option.

Abort
1
2
3
4
5
6
LOAD INTO ki_home.error_example
FROM FILE PATHS 'kifs://data/orders.csv'
FORMAT TEXT
WITH OPTIONS (
    ON ERROR = ABORT
)
Log Bad Records
1
2
3
4
5
6
7
LOAD INTO ki_home.error_example
FROM FILE PATHS 'kifs://data/orders.csv'
FORMAT TEXT
WITH OPTIONS (
    BAD RECORD TABLE NAME = 'error_example_invalid',
    ON ERROR = SKIP
)

Load Specific Columns

In some cases, you may only want to store certain columns from your source data. Use the FIELDS MAPPED BY NAME(...) option, which allows you to specify the desired fields to store in Kinetica.

Load Specific Columns Example
1
2
3
4
5
6
LOAD DATA INTO example.orders
FROM FILE PATHS 'kifs://data/orders.csv'
FORMAT TEXT
WITH OPTIONS (
    FIELDS MAPPED BY NAME(ID, Name, Product_ID, Quantity)
)

DateTime Formatting

Use the COLUMN FORMATS option to format date and time fields into Kinetica Date, Time, and DateTime columns. Time formats are specified using a JSON formatted string. Non-placeholder characters must be wrapped in quotes, which must also be escaped (e.g. '{"dt": {"date": "\"(\"YYYY\")\" Mon, DD"}}'). Alternatively, you can use the ? character as a wildcard character. Note that Kinetica does not handle or store timezones and they will be discarded. See the full list of supported date and time format codes.

ISO 8601 Timestamps
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
-- Example Data:
-- dt
-- 2022-01-19T15:50:42Z+05:00

LOAD INTO ki_home.timestamp_example
FROM FILE PATHS 'kifs://data/orders.csv'
FORMAT TEXT
WITH OPTIONS (
    COLUMN FORMATS = '{"dt": {"datetime": "YYYY-MM-DD?HH:MI:SS"}}'
)
Custom Date
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
-- Example Data:
-- d
-- (2022) Feb, 22

LOAD INTO ki_home.date_example
FROM FILE PATHS 'kifs://data/orders.csv'
FORMAT TEXT
WITH OPTIONS (
    COLUMN FORMATS = '{"d": {"date": "\"(\"YYYY\")\" Mon, DD"}}'
)
Custom Time
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
-- Example Data:
-- t
-- 18-27-59.5536

LOAD INTO ki_home.time_example
FROM FILE PATHS 'kifs://data/orders.csv'
FORMAT TEXT
WITH OPTIONS (
    COLUMN FORMATS = '{"t": {"time": "HH-MI-SS.MS"}}'
)