Kinetica Docker installation and configuration instructions.
Prerequisites
License Key
Installing Kinetica within a Docker container requires a license key. To receive a license key, contact support at support@kinetica.com.
Networking Configuration
The docker run commands given in Install Kinetica Docker Image will make several Kinetica ports available on the host:
- 8080 - Administration Web Application
- 8088 - Reveal
- 9191 - Database API Service
Make sure these are available on the host system, or adjust the docker run command accordingly. More information on port adjustment can be found under Ports.
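If one of these default ports is already taken on the host, the mapping can be shifted to a free host port while keeping the container port the same. A sketch, assuming the Intel-based image and arbitrarily chosen free host ports 18080, 18088, and 19191:

```shell
# Publish Kinetica's container ports on alternate host ports; the left side
# of each mapping is the host port, the right side is the container port.
docker run \
    -p 18080:8080 -p 18088:8088 -p 19191:9191 \
    kinetica/kinetica-intel:latest
```

GAdmin would then be reachable at http://localhost:18080/ on the host.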
GPU Software
For GPU based systems, supporting software must be installed:
- Nvidia Drivers - see Nvidia Drivers for details on installing drivers on a Unix host
- Nvidia Docker - see Nvidia Docker on GitHub for details
Install Kinetica Docker Image
Both GPU-based & Intel-based versions of Kinetica are available via Docker, to suit the target installation environment. Run the docker command below that matches the host server.
GPU Based System
CUDA 8.0
On nvidia-docker v1:
nvidia-docker run \
    -p 8080:8080 -p 8088:8088 -p 9191:9191 \
    kinetica/kinetica-cuda80:latest
On nvidia-docker v2:
docker run --runtime=nvidia \
    -p 8080:8080 -p 8088:8088 -p 9191:9191 \
    kinetica/kinetica-cuda80:latest
CUDA 9.0
On nvidia-docker v1:
nvidia-docker run \
    -p 8080:8080 -p 8088:8088 -p 9191:9191 \
    kinetica/kinetica-cuda90:latest
On nvidia-docker v2:
docker run --runtime=nvidia \
    -p 8080:8080 -p 8088:8088 -p 9191:9191 \
    kinetica/kinetica-cuda90:latest
CUDA 9.1
On nvidia-docker v1:
nvidia-docker run \
    -p 8080:8080 -p 8088:8088 -p 9191:9191 \
    kinetica/kinetica-cuda91:latest
On nvidia-docker v2:
docker run --runtime=nvidia \
    -p 8080:8080 -p 8088:8088 -p 9191:9191 \
    kinetica/kinetica-cuda91:latest
Non-GPU Based System
docker run \
    -p 8080:8080 -p 8088:8088 -p 9191:9191 \
    kinetica/kinetica-intel:latest
Initialize the Database
The database will have to be initialized before it can be started. This will be accomplished using the Visual Installer.
Note
As this container is the node through which all interaction with Kinetica is performed, it is considered the head node, and will be referred to in that manner in this section.
The Visual Installer is run through the Kinetica Administration Application (GAdmin) and simplifies the installation of Kinetica.
Browse to the head node, using IP or host name:
http://localhost:8080/
Once you've arrived at the login page, you'll need to change your password and initialize the system using the following steps:
Log into the admin application
- Enter Username: admin
- Enter Password: admin
- Click Login
If a license key has not already been configured, a Product Activation page will be displayed, where the license key is to be entered:
- Enter the license key under Enter License Key
- When complete, click Activate, then confirm the activation
At the Setup Wizard page, configure the system basics:
- Enter the IP Address and number of GPUs (if any) for each server in the cluster
- Optionally, select the Public Head IP Address checkbox and update the address as necessary
- The license key under Configure License Key should already be populated
- When complete, click Save
Important
For additional configuration options, see the Configuration Reference.
Start the system. This will start all Kinetica processes on the head node, and if in a clustered environment, the corresponding processes on the worker nodes.
- Click Admin on the left menu
- Click Start.
See Changing the Administrator Password for instructions on updating the administration account's password.
Configuration Options & Considerations
The default Docker configuration can be modified in several ways.
Automating Configuration on Docker Run
To customize the Kinetica instance running in the Docker container, overriding any of the parameters normally found in /opt/gpudb/core/etc/gpudb.conf, a gpudb.conf file can be created inside the mounted persist folder (specified in the -v option to docker run) and have any configuration parameter overrides added to it. This file only needs to contain the configuration items from the default gpudb.conf that are to be overridden.
For example, to set the license key to my_license_key, create a gpudb.conf on the host in the folder being bind mounted to /opt/gpudb/persist in the container, and add the following contents:

[gaia]
license_key = my_license_key
Automating Start on Docker Run
When the Kinetica container is started, by default it will only start the Host Manager and the Kinetica Administration Application (GAdmin). To also start the database, pass the FULL_START=1 environment variable to the docker run command. For example:
docker run -e FULL_START=1 ...
GPUs
Configuration of GPUs is slightly different when running Kinetica inside a container. On servers with multiple GPUs, other applications may be using some of the GPUs on the system. Kinetica will re-order the indexes of GPUs inside the Kinetica container in order of largest amount of free memory first. The rankN.taskcalc_gpu settings in gpudb.conf should be set to 0 for the most available GPU on the host, 1 for the second most available, and so on. In most cases, this will result in a configuration file looking something like the following:
rank1.taskcalc_gpu = 0
rank2.taskcalc_gpu = 1
rank3.taskcalc_gpu = 2
This way, Kinetica will always choose the most available GPUs on start, even if those are not indexed that way by default.
In addition, Kinetica will also filter the list of GPUs based on a minimum available memory criterion. By default, it will filter out any GPUs with less than 1GB of free memory. This filter can be modified by passing the MINIMUM_GPU_MEMORY environment variable (in megabytes) to docker.
For example, to change the minimum GPU memory filter to 500MB:
docker run -e MINIMUM_GPU_MEMORY=500 ...
RAM
By default, the RAM Tier in Kinetica is configured to use as much RAM as possible. In order to avoid running out of memory when performing large data operations, it is recommended that the RAM Tier be limited using the following guidelines:
- Head Node
- Rank0: 10% of system memory
- Other ranks: 70% of system memory / # worker ranks
- Worker Node
- Worker ranks: 80% of system memory / # worker ranks
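These percentages can be sketched numerically; for a hypothetical 4GB head node with 2 worker ranks (using 1GB = 10^9 bytes for simplicity):

```shell
# Derive per-rank RAM Tier limits from total system memory, following the
# guidelines above: 10% for rank0, 70% split evenly across worker ranks.
TOTAL_RAM_BYTES=4000000000   # 4 GB single-node instance
WORKER_RANKS=2

RANK0_LIMIT=$(( TOTAL_RAM_BYTES / 10 ))
WORKER_LIMIT=$(( TOTAL_RAM_BYTES * 7 / 10 / WORKER_RANKS ))

echo "rank0 limit:  $RANK0_LIMIT"    # 400000000  (400 MB)
echo "worker limit: $WORKER_LIMIT"   # 1400000000 (1.4 GB per worker rank)
```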
For example, a single-node instance with 4GB of RAM and 2 worker ranks can be
configured by modifying the /opt/gpudb/core/etc/gpudb.conf as follows:

tier.ram.rank0.limit = 400000000
tier.ram.rank1.limit = 1400000000
tier.ram.rank2.limit = 1400000000
Persist / Volumes
When run, Kinetica will create a Docker Volume for /opt/gpudb/persist. On the host, Docker automatically creates & manages a directory that backs this container directory. By default, this volume remains across container runs, even if the container is deleted.
There are several aspects of this behavior that can be modified.
Removing Volume Data
To remove the default Docker Volume after each container run, use the --rm option to the docker run command. Note that this will also remove the Docker container itself.
For example, to run the Intel-based Kinetica Docker container, removing the default Docker Volume after the end of the run:
docker run --rm \
    -p 8080:8080 -p 8088:8088 -p 9191:9191 \
    kinetica/kinetica-intel:latest
Important
Using this option will require the database to be reinitialized after each run, as detailed in the Initialize the Database section.
Reusing Persist
If a docker container is stopped and deleted and the persist directory & files remain, they can be re-used with a new Kinetica container. Kinetica will, in this case, attempt to re-use the configuration from the last time the database was started.
To override this behavior, put an empty gpudb.conf in the persist folder (or provide any override configuration parameters as outlined above), and the container will be started with a fresh configuration.
Mounting a Volume
The default volume creation can be overridden with a bind mount by using the --volume option (-v) on the command line, specifying a directory on the host that should be made available in the Docker container.
For example, to run the Intel-based Kinetica Docker container, mounting the host directory /home/kinetica/persist to /opt/gpudb/persist within the container, use the -v option:

docker run \
    -p 8080:8080 -p 8088:8088 -p 9191:9191 \
    -v /home/kinetica/persist:/opt/gpudb/persist \
    kinetica/kinetica-intel:latest
Note that for security, Kinetica runs in the container as a non-root user, Unix UID and GID 65432. Due to the mechanics of Docker containers, any host folder bind mounted inside of the container must have permissions on the host set such that it is accessible to user/group 65432:65432. To make this assignment more straightforward, reserve corresponding host user accounts for both the Kinetica user and the Kinetica UDF user, mapping the Kinetica user to that ID:

sudo groupadd -g 65432 gpudb
sudo useradd -u 65432 -g 65432 gpudb
sudo useradd -g gpudb gpudb_proc
Now, that account can be used to change ownership of any mounted folders before running the container.
For example, to set permissions on a host directory named /home/kinetica/persist so that it is available to be bind mounted in the container, run the following command on the host:

sudo chown -R 65432:65432 /home/kinetica/persist
Managing Volumes
To get information about the Mounts on a particular container, including the volume that is automatically created for Kinetica containers, run:
docker inspect --format "{{json .Mounts}}" <container-id>
The standard docker volume commands can then be used to manage the volume. Note that storing information in a Docker Volume is convenient for getting up and running quickly, but performance penalties may be incurred when using one to store persist information. An alternate persistent storage method is recommended in a production environment where performance is a concern.
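As a sketch of those standard commands (for an anonymous volume, the name shown by docker volume ls will be an auto-generated hash):

```shell
# List all volumes, inspect one, and remove it once its container is deleted.
docker volume ls
docker volume inspect <volume-name>
docker volume rm <volume-name>
```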
Ports
Kinetica communicates with the outside over a set of Ports. You can choose to expose any of these ports when the container is run by using the -P or --publish-all option to publish all ports exposed in the image.
Note that these ports will be randomly mapped in this case; the docker port <container-id> command will show the assigned mappings. To map container ports to specific host ports, individually map ports using the --publish (-p) option. Port mapping is of the format:

<host_ip>:<host_port>:<container_port>/<protocol>

Unless specified, <host_ip> will listen on all host IPs, <host_port> is assigned a random available port, and <protocol> is tcp. Docker will bind these ports on the host when the container is run, not only when they are requested by Kinetica, and will fail if any of the specified host ports is unavailable.
Alternatively, the docker container can be run with --net=host, at which point no port mappings are necessary, as the docker container will open ports directly on the host.
Privileged Mode
Some systems will necessitate Docker containers being run in privileged mode to function properly. To run a container in privileged mode, use the --privileged option when running the docker run command.
For example, to run the Intel-based Kinetica Docker container in privileged mode:
docker run --privileged \
    -p 8080:8080 -p 8088:8088 -p 9191:9191 \
    kinetica/kinetica-intel:latest
Troubleshooting
If an error is encountered while attempting to run the container, try running it in privileged mode.