Kinetica File System (KiFS)

The Kinetica File System (KiFS) is a filesystem interface that's packaged with Kinetica. KiFS must be enabled via the gpudb.conf file and, by default, is only accessible to the gpudb_proc user. Once KiFS is enabled, the filesystem can be browsed from the Kinetica Administration Application (GAdmin). KiFS is particularly useful for distributing machine learning models to UDFs, as it removes the need to manually copy the model to each node in a cluster.

Command Line Interface

One can interface with KiFS using the /opt/gpudb/bin/kifs script. Using the script instead of the default mount (see Enabling the Default Mount) is useful when another mount point is necessary, either for a separate user or on a node that is not part of the cluster.

Command Line Options:

Parameter                  Description
--username <username>      Username required to connect to Kinetica when authentication is enabled
--password <password>      Password required to connect to Kinetica when authentication is enabled
--host <host>              Kinetica instance to connect to; default is localhost
--port <port>              Kinetica instance connection port; default is 9191
--mount_point </dir/path>  Custom mount point for KiFS. The specified directory will be created if it doesn't exist. The mount point will be removed on exit; default mount point is /root/kifs_v1_mount
-h                         Display help information
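
As an illustration of how the options above combine, the following sketch assembles a kifs command line and prints it rather than running it. The function name kifs_command and the example host and mount point are hypothetical; only flags and defaults listed in the table are used.

```shell
# Print a kifs command line for the given host, port, and mount point.
# kifs_command is a hypothetical helper; it only echoes the command.
kifs_command() {
    _host=${1:-localhost}               # documented default host
    _port=${2:-9191}                    # documented default port
    _mount=${3:-/root/kifs_v1_mount}    # documented default mount point
    printf '%s --host %s --port %s --mount_point %s\n' \
        /opt/gpudb/bin/kifs "$_host" "$_port" "$_mount"
}

# Example with hypothetical values; review the printed command before running it.
kifs_command kinetica.example.com 9191 /tmp/my_kifs_mount
```

When authentication is enabled, --username and --password would be appended to the printed command in the same way.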

Enabling the Default Mount

  1. Open /opt/gpudb/core/etc/gpudb.conf and set the following parameter to true:

    enable_kifs = true
    
  2. Optionally, update the default parent directory mount point for the filesystem. The actual mount point will be a subdirectory, mount, below this directory. This directory must have read, write, and execute permissions enabled for the gpudb user and gpudb_proc group and must not be located on a Network File System (NFS):

    kifs_mount_point = /opt/gpudb/kifs
    
  3. Restart Kinetica. The mount is ready and the KiFS Browser section will be accessible via the Data menu. A filesystem collection and internal kifs user will also be added.

    Important

    Users other than the default gpudb_proc user can be granted access to the mount. To do so, user_allow_other must be enabled in /etc/fuse.conf and each user being granted access to KiFS must be added to the gpudb_proc group; e.g.,

    sudo usermod -a -G gpudb_proc <username>
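
Before restarting, it can help to confirm that the settings from the steps above are actually in place. The sketch below is one way to do that with grep; the helper names conf_is_true and fuse_allows_other are hypothetical, and the patterns assume the "key = true" layout shown above and an uncommented user_allow_other line in the FUSE configuration.

```shell
# Succeeds when KEY is set to true in a gpudb.conf-style file.
# Usage: conf_is_true FILE KEY
conf_is_true() {
    grep -Eq "^[[:space:]]*$2[[:space:]]*=[[:space:]]*true[[:space:]]*\$" "$1"
}

# Succeeds when user_allow_other appears uncommented in a fuse.conf-style file.
# Usage: fuse_allows_other FILE
fuse_allows_other() {
    grep -Eq '^[[:space:]]*user_allow_other[[:space:]]*$' "$1"
}

# Typical use (paths taken from the steps above):
#   conf_is_true /opt/gpudb/core/etc/gpudb.conf enable_kifs && echo "KiFS enabled"
#   fuse_allows_other /etc/fuse.conf && echo "other users may access the mount"
```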
    

Adding Data to KiFS

Once the KiFS mount is available, it is ready to store data. Data can be added either with a cp/scp command (the mount behaves like a normal file system) or via the Kinetica Administration Application (GAdmin). Note that files cannot be stored in the root directory of the mount; they can only be stored in directories directly under the root, and those directories cannot be nested. For example:

# Make a valid sub-directory
mkdir /opt/gpudb/kifs/mount/my_dir

# Correct storage
cp filename.ext /opt/gpudb/kifs/mount/my_dir

# Incorrect storage
cp filename.ext /opt/gpudb/kifs/mount/
cp: cannot create regular file ‘/opt/gpudb/kifs/mount/filename.ext’: No such file or directory

# Attempt to make a nested sub-directory
mkdir /opt/gpudb/kifs/mount/my_dir/my_nested_dir
mkdir: cannot create directory ‘/opt/gpudb/kifs/mount/my_dir/my_nested_dir’: No such file or directory
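
The placement rule shown above (exactly one directory level between the mount root and the file) can be checked before copying. The following is a minimal sketch; kifs_path_ok is a hypothetical helper that only inspects the path string and does not touch the filesystem.

```shell
# Succeeds when PATH has the form MOUNT/<dir>/<file>, i.e. the file sits in a
# directory directly under the mount root (not in the root, not nested).
# Usage: kifs_path_ok MOUNT PATH
kifs_path_ok() {
    _mount=${1%/}
    case "$2" in
        "$_mount"/*) _rel=${2#"$_mount"/} ;;   # path relative to the mount
        *) return 1 ;;                         # not under the mount at all
    esac
    case "$_rel" in
        */*/*|/*|*/|'') return 1 ;;            # nested or malformed
        */*) return 0 ;;                       # <dir>/<file>
        *) return 1 ;;                         # file directly in the root
    esac
}

# Check the three cases from the example above.
for p in /opt/gpudb/kifs/mount/my_dir/filename.ext \
         /opt/gpudb/kifs/mount/filename.ext \
         /opt/gpudb/kifs/mount/my_dir/my_nested_dir/filename.ext; do
    if kifs_path_ok /opt/gpudb/kifs/mount "$p"; then
        echo "ok: $p"
    else
        echo "rejected: $p"
    fi
done
```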

Each directory created in the root mount directory will have a corresponding table in the filesystem collection, and each file added to a directory in the mount will have a corresponding record in the table it is "stored" in. For example, if the KiFS mount directory looks like this:

/opt/gpudb/kifs/mount/
└── procs
   ├── udf_sos_py_exec.py
   ├── udf_sos_py_init.py
   └── udf_sos_py_proc.py

The filesystem collection will contain a __kifs__procs table with the following records:

[Image: records of the __kifs__procs table (data_kifs_table_records.png)]
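
To see which tables to expect for a given mount, the directory-to-table mapping can be sketched as below. This assumes that the __kifs__ prefix seen in the procs example applies to every directory, which is an inference from that single example; kifs_expected_tables is a hypothetical helper.

```shell
# Print the table name expected for each directory directly under MOUNT,
# assuming the "__kifs__<directory>" pattern from the procs example.
# Usage: kifs_expected_tables MOUNT
kifs_expected_tables() {
    for _d in "${1%/}"/*/; do
        [ -d "$_d" ] || continue        # skips the unexpanded glob when empty
        _name=${_d%/}                   # drop the trailing slash
        printf '__kifs__%s\n' "${_name##*/}"
    done
}

# Typical use:
#   kifs_expected_tables /opt/gpudb/kifs/mount
```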

Copy / Secure Copy

If you are logged into a remote machine where KiFS is available and you have access to KiFS (see step 3 of Enabling the Default Mount), you can copy files from that machine into the mount as you would into any directory:

cp filename.ext /opt/gpudb/kifs/mount/data/

If you are not logged into a machine where KiFS is available, but you have access to KiFS on a remote machine (see step 3 of Enabling the Default Mount), you can copy local files to the remote mount with scp:

scp filename.ext username@domain:/opt/gpudb/kifs/mount/data
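
The local copy can be wrapped so that the target directory is created first and nested targets are refused up front. This is a sketch under the assumption that mkdir and cp behave on the mount as shown in Adding Data to KiFS; kifs_put is a hypothetical helper, and the mount path defaults to the one used in this section.

```shell
# Copy FILE into the directory DIR directly under the KiFS mount,
# creating DIR first if it does not exist.
# Usage: kifs_put FILE DIR [MOUNT]
kifs_put() {
    _src=$1
    _dir=$2
    _mount=${3:-/opt/gpudb/kifs/mount}
    case "$_dir" in
        ''|*/*)
            # Empty names and nested paths are not valid KiFS directories.
            echo "kifs_put: DIR must be a single directory name" >&2
            return 1 ;;
    esac
    [ -d "$_mount/$_dir" ] || mkdir "$_mount/$_dir" || return 1
    cp "$_src" "$_mount/$_dir/"
}

# Typical use:
#   kifs_put filename.ext data
```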

Limitations and Cautions

  1. KiFS cannot be located on a Network File System (NFS)
  2. While the directories and files in the mount are available via the Tables section, they should be browsed using the KiFS Browser
  3. Using GAdmin to edit the records in the filesystem collection that correspond to files can cause those files to disappear from the mount or become unusable
  4. The theoretical maximum file size that can be stored in the mount is equal to the maximum size of a bytes column, listed in the Column Types section under bytes; any file larger than this amount is truncated to fit in the mount. However, this size may be constrained further by available system resources.
  5. Interacting with KiFS using a third-party GUI can generate filesystem calls that KiFS does not support
  6. File metadata is not currently supported, e.g., file creation datetime, last accessed datetime, modified datetime, etc.
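
Because oversized files are truncated rather than rejected (item 4), it may be worth comparing a file with its copy in the mount after each transfer. The sketch below uses cmp, which compares contents only and so does not depend on file metadata; kifs_verify_copy is a hypothetical helper.

```shell
# Succeeds when the copy is byte-for-byte identical to the source;
# otherwise reports the mismatch on stderr and fails.
# Usage: kifs_verify_copy SOURCE COPY
kifs_verify_copy() {
    if cmp -s "$1" "$2"; then
        return 0
    fi
    echo "kifs_verify_copy: $2 differs from $1 (possibly truncated)" >&2
    return 1
}

# Typical use:
#   cp model.bin /opt/gpudb/kifs/mount/my_dir/ &&
#       kifs_verify_copy model.bin /opt/gpudb/kifs/mount/my_dir/model.bin
```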