Kinetica Installation with Docker

Kinetica Docker installation and configuration instructions.

Prerequisites

License Key

Installing Kinetica within a Docker container requires a license key. To receive a license key, contact support at support@kinetica.com.

Networking Configuration

The docker run commands given in Install Kinetica Docker Image will make several Kinetica ports available on the host:

  • 8080 - Administration Web Application
  • 8088 - Reveal
  • 9191 - Database API Service

Make sure these are available on the host system, or adjust the docker run command accordingly. More information on port adjustment can be found under Ports.
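
As a quick check, the following sketch (ours, not part of the product) probes each default port on the local host; a successful TCP connect means something is already listening there. It assumes a Linux host whose shell supports bash's /dev/tcp:

```shell
# Probe each Kinetica default port; a successful connect means the
# port is already taken and the docker run mapping should be adjusted.
for port in 8080 8088 9191; do
    if (exec 3<>"/dev/tcp/127.0.0.1/${port}") 2>/dev/null; then
        echo "port ${port} is already in use"
    else
        echo "port ${port} is free"
    fi
done
```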

GPU Software

For GPU-based systems, supporting software must be installed first: an NVIDIA driver compatible with one of the CUDA versions below, and either nvidia-docker v1 or the nvidia-docker v2 runtime, matching the docker run commands in the next section.

Install Kinetica Docker Image

Both GPU-based and Intel-based versions of Kinetica are available via Docker, to suit the target installation environment. Run the docker run command below that matches the host server.

GPU Based System

CUDA 8.0

  • On nvidia-docker v1:

    nvidia-docker run \
                  -p 8080:8080 -p 8088:8088 -p 9191:9191 \
                  kinetica/kinetica-cuda80:latest
    
  • On nvidia-docker v2:

    docker run --runtime=nvidia \
           -p 8080:8080 -p 8088:8088 -p 9191:9191 \
           kinetica/kinetica-cuda80:latest
    

CUDA 9.0

  • On nvidia-docker v1:

    nvidia-docker run \
                  -p 8080:8080 -p 8088:8088 -p 9191:9191 \
                  kinetica/kinetica-cuda90:latest
    
  • On nvidia-docker v2:

    docker run --runtime=nvidia \
           -p 8080:8080 -p 8088:8088 -p 9191:9191 \
           kinetica/kinetica-cuda90:latest
    

CUDA 9.1

  • On nvidia-docker v1:

    nvidia-docker run \
                  -p 8080:8080 -p 8088:8088 -p 9191:9191 \
                  kinetica/kinetica-cuda91:latest
    
  • On nvidia-docker v2:

    docker run --runtime=nvidia \
           -p 8080:8080 -p 8088:8088 -p 9191:9191 \
           kinetica/kinetica-cuda91:latest
    

Non-GPU Based System

docker run \
       -p 8080:8080 -p 8088:8088 -p 9191:9191 \
       kinetica/kinetica-intel:latest

Initialize the Database

The database must be initialized before it can be started. This is accomplished using the Visual Installer.

Note

As this container is the node through which all interaction with Kinetica is performed, it is considered the head node, and will be referred to in that manner in this section.

The Visual Installer is run through the Kinetica Administration Application (GAdmin) and simplifies the installation of Kinetica.

Browse to the head node, using IP or host name:

http://localhost:8080/

Once you've arrived at the login page, log in, activate the product if necessary, and initialize the system using the following steps:

  1. Log into the admin application

    1. Enter Username: admin
    2. Enter Password: admin
    3. Click Login
  2. If a license key has not already been configured, a Product Activation page will be displayed, where the license key is to be entered:

    [Image: Product Activation page]
    1. Enter the license key under Enter License Key
    2. When complete, click Activate, then confirm the activation
  3. At the Setup Wizard page, configure the system basics:

    1. Enter the IP Address and number of GPUs (if any) for each server in the cluster
    2. Optionally, select the Public Head IP Address checkbox and update the address as necessary
    3. The license key under Configure License Key should already be populated
    4. When complete, click Save

    Important

    For additional configuration options, see the Configuration Reference.

  4. Start the system. This will start all Kinetica processes on the head node, and if in a clustered environment, the corresponding processes on the worker nodes.

    1. Click Admin on the left menu
    2. Click Start.
  5. See Changing the Administrator Password for instructions on updating the administration account's password.

Configuration Options & Considerations

The default Docker configuration can be modified in several ways.

Automating Configuration on Docker Run

To customize the Kinetica instance running in the Docker container, create a gpudb.conf file inside the mounted persist folder (specified in the -v option to docker run) and add any configuration parameter overrides to it. Parameters set there override those normally found in /opt/gpudb/core/etc/gpudb.conf; the file only needs to contain the configuration items from the default gpudb.conf that are to be overridden.

For example, to set the license key to my_license_key, create a gpudb.conf on the host in the folder being bind mounted to /opt/gpudb/persist in the container, and add the following contents:

[gaia]
license_key = my_license_key
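
Putting this together, a minimal host-side sketch (the kinetica-persist path is an example; use whichever directory you bind mount to /opt/gpudb/persist):

```shell
# Create the host persist folder and drop an override file into it;
# only the overridden parameters need to appear here.
PERSIST_DIR="${PERSIST_DIR:-$PWD/kinetica-persist}"
mkdir -p "$PERSIST_DIR"

cat > "$PERSIST_DIR/gpudb.conf" <<'EOF'
[gaia]
license_key = my_license_key
EOF
```

The folder is then passed to docker run with -v "$PERSIST_DIR":/opt/gpudb/persist, as shown under Mounting a Volume.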

Automating Start on Docker Run

When the Kinetica container is started, by default it will only start the Host Manager and the Kinetica Administration Application (GAdmin). To also start the database, pass the FULL_START=1 environment variable to the docker run command. For example:

docker run -e FULL_START=1 ...

GPUs

Configuration of GPUs is slightly different when running Kinetica inside a container. On servers with multiple GPUs, other applications may be using some of the GPUs on the system. Kinetica re-orders the GPU indexes inside the container from most free memory to least. The rankN.taskcalc_gpu settings in gpudb.conf should be set to 0 for the most available GPU on the host, 1 for the second most available, and so on. In most cases, this will result in a configuration file looking something like the following:

...
rank1.taskcalc_gpu = 0
rank2.taskcalc_gpu = 1
...

This way, Kinetica will always choose the most available GPUs on start, even if those are not indexed that way by default.

In addition, Kinetica will also filter the list of GPUs based on a minimum available memory criterion. By default, it will filter out any GPUs with less than 1GB of free memory. This filter can be modified by passing the MINIMUM_GPU_MEMORY environment variable (in megabytes) to docker.

For example, to change the minimum GPU memory filter to 500MB:

docker run -e MINIMUM_GPU_MEMORY=500 ...

RAM

By default, the RAM Tier in Kinetica is configured to use as much RAM as possible. In order to avoid running out of memory when performing large data operations, it is recommended that the RAM Tier be limited using the following guidelines:

  • Head Node
    • Rank0: 10% of system memory
    • Other ranks: 70% of system memory / # worker ranks
  • Worker Node
    • Worker ranks: 80% of system memory / # worker ranks

For example, a single-node instance with 4GB of RAM and 2 worker ranks can be configured by modifying /opt/gpudb/core/etc/gpudb.conf as follows (limit values are in bytes):

tier.ram.rank0.limit = 429496729
tier.ram.rank1.limit = 1503238553
tier.ram.rank2.limit = 1503238553
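
The limits above are byte values derived from the guidelines; as a sanity check, the arithmetic for this 4 GiB, 2-worker-rank example can be sketched as:

```shell
# RAM Tier limits in bytes: 10% of system memory for rank0, 70% split
# evenly across the worker ranks (integer division, as in the example).
TOTAL_BYTES=$((4 * 1024 * 1024 * 1024))   # 4 GiB host
WORKER_RANKS=2

RANK0_LIMIT=$((TOTAL_BYTES / 10))
WORKER_LIMIT=$((TOTAL_BYTES * 70 / 100 / WORKER_RANKS))

echo "tier.ram.rank0.limit = $RANK0_LIMIT"   # 429496729
echo "tier.ram.rank1.limit = $WORKER_LIMIT"  # 1503238553
echo "tier.ram.rank2.limit = $WORKER_LIMIT"  # 1503238553
```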

Persist / Volumes

When run, Kinetica will create a Docker Volume for /opt/gpudb/persist. On the host, Docker automatically creates and manages a directory that backs this container directory. By default, this volume persists across container runs, even if the container is deleted.

There are several aspects of this behavior that can be modified.

Removing Volume Data

To remove the default Docker Volume after each container run, use the --rm option to the docker run command. Note that this will also remove the Docker container itself.

For example, to run the Intel-based Kinetica Docker container, removing the default Docker Volume after the end of the run:

docker run \
       --rm \
       -p 8080:8080 -p 8088:8088 -p 9191:9191 \
       kinetica/kinetica-intel:latest

Important

Using this option will require the database to be reinitialized after each run, as detailed in the Initialize the Database section.

Reusing Persist

If a Docker container is stopped and deleted but the persist directory and files remain, they can be reused with a new Kinetica container. In this case, Kinetica will attempt to reuse the configuration from the last time the database was started.

To override this behavior, put an empty gpudb.conf in the persist folder (or provide any override configuration parameters as outlined above), and the container will be started with a fresh configuration.
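
For example, a small sketch of resetting a reused persist folder (the kinetica-persist path is an example; use the directory you bind mount to /opt/gpudb/persist):

```shell
# Truncate (or create) gpudb.conf in the persist folder; with no
# overrides present, the next container run starts with a fresh
# configuration instead of the previous one.
PERSIST_DIR="${PERSIST_DIR:-$PWD/kinetica-persist}"
mkdir -p "$PERSIST_DIR"
: > "$PERSIST_DIR/gpudb.conf"
```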

Mounting a Volume

The default volume creation can be overridden with a bind mount by using the --volume option (-v) on the command line, specifying a directory on the host that should be made available in the Docker container.

For example, to run the Intel-based Kinetica Docker container, mounting the host directory /home/kinetica/persist to /opt/gpudb/persist within the container, use the -v option:

docker run \
       -p 8080:8080 -p 8088:8088 -p 9191:9191 \
       -v /home/kinetica/persist:/opt/gpudb/persist \
       kinetica/kinetica-intel:latest

Note that, for security, Kinetica runs in the container as a non-root user with Unix UID and GID 65432. Due to the mechanics of Docker containers, any host folder bind mounted inside the container must have host permissions that make it accessible to user/group 65432:65432. To make this assignment more straightforward, reserve corresponding host user accounts for both the Kinetica user and the Kinetica UDF user, mapping the Kinetica user to that ID:

sudo groupadd -g 65432 -r kinetica
sudo useradd  -u 65432 -g kinetica -M -r -s /bin/bash -c 'Kinetica User' kinetica
sudo groupadd -g 65433 -r kinetica_proc
sudo useradd  -u 65433 -g kinetica_proc -M -r -s /bin/bash -c 'Kinetica Proc User' kinetica_proc

Now, that account can be used to change ownership of any mounted folders before running the container.

sudo chown kinetica:kinetica </path/on/host>

For example, to set permissions on a host directory named /home/kinetica/persist so that it is available to be bind mounted in the container, run the following command on the host:

sudo chown kinetica:kinetica /home/kinetica/persist

Managing Volumes

To get information about the Mounts on a particular container, including the volume that is automatically created for Kinetica containers, run:

docker inspect --format '{{ json .Mounts }}' <container id>

The standard docker volume commands can then be used to manage the volume. Note that storing information in a Docker Volume is convenient for getting up and running quickly, but performance penalties may be incurred when using one to store persist information. An alternate persistent storage method is recommended in a production environment where performance is a concern.

Ports

Kinetica communicates with the outside over a set of Default Ports. You can choose to expose any of these ports when the container is run by using the -P or --publish-all option to publish all ports exposed in the image.

Note that these ports will be randomly mapped in this case; use the docker port <container-id> command to see the assigned mappings. To map container ports to specific host ports instead, map them individually using the --publish (-p) option. Port mapping is of the format:

-p <host_ip>:<host_port>:<container_port>/<protocol>

Unless specified, <host_ip> defaults to all host IPs, <host_port> is assigned a random available port, and <protocol> defaults to tcp. Docker will bind these ports on the host when the container is run, not only when they are requested by Kinetica, and will fail if any of the specified host ports is unavailable.
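
For example, a sketch of publishing all exposed ports and then listing the assignments (the container name kinetica-demo is our example; this requires a running Docker daemon):

```shell
# Publish every port exposed by the image on a random available host port
docker run -d --name kinetica-demo -P kinetica/kinetica-intel:latest

# List the resulting host mappings, e.g. "9191/tcp -> 0.0.0.0:32768"
docker port kinetica-demo
```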

Alternately, the docker container can be run with --net=host, at which point no port mappings are necessary, as the docker container will open ports directly on the host.

Privileged Mode

Some systems require Docker containers to be run in privileged mode in order to function properly. To run a container in privileged mode, add the --privileged option to the docker run command.

For example, to run the Intel-based Kinetica Docker container in privileged mode:

docker run \
       --privileged \
       -p 8080:8080 -p 8088:8088 -p 9191:9191 \
       kinetica/kinetica-intel:latest

Troubleshooting

If the following error is encountered while attempting to run the container, try running in privileged mode:

docker run \
       -p 8080:8080 -p 8088:8088 -p 9191:9191 \
       -v /tmp/kinetica:/opt/gpudb/persist \
       kinetica/kinetica-intel:latest
...
could not open session
could not open session
Error starting host manager with /opt/gpudb/core/bin/gpudb as user: gpudb, error code 1