Configuration Reference

Kinetica configuration consists of several user-modifiable system parameters present in /opt/gpudb/core/etc/gpudb.conf that are used to set up and tune the system.

Network setup

Configuration parameter default value description

head_ip_address 127.0.0.1 Head HTTP server IP address; this specifies the publicly available IP address of the first process, rank 0.
head_port 9191 Head HTTP server port.
use_https false Set to true to use HTTPS; if true then https_key_file and https_cert_file must be provided

https_key_file   File containing the SSL private key.
https_cert_file   File containing the SSL certificate. If required, a self-signed certificate (expires after 10 years) can be generated via the command:

openssl req -newkey rsa:2048 -new -nodes -x509 -days 3650 -keyout key.pem -out cert.pem
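
For example, to serve the API over HTTPS using the key and certificate generated above (the file locations are illustrative; any path readable by the gpudb user works):

use_https       = true
https_key_file  = /opt/gpudb/core/etc/key.pem
https_cert_file = /opt/gpudb/core/etc/cert.pem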
http_allow_origin   Value to return via the Access-Control-Allow-Origin HTTP header (for Cross-Origin Resource Sharing). Set to empty to not return the header and disallow CORS.
enable_httpd_proxy false Use an httpd server as a proxy to handle LDAP and/or Kerberos authentication. Each host will run an httpd server, and access to each rank is available through http://host:8082/gpudb-1, where port 8082 is defined by 'httpd_proxy_port'. NOTE: HTTPd external endpoints are not affected by the use_https parameter above. If you wish to enable HTTPS on HTTPd, you must edit /opt/gpudb/httpd/conf/httpd.conf and set up HTTPS as per the Apache HTTPd documentation at https://httpd.apache.org/docs/2.2/
httpd_proxy_port 8082 TCP port that the httpd auth proxy server will listen on if 'enable_httpd_proxy' is true.
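For example, to front each rank with the httpd authentication proxy on its default port (values shown are the defaults from this section):

enable_httpd_proxy = true
httpd_proxy_port   = 8082

With this enabled, rank 1 would be reachable at http://host:8082/gpudb-1 as described above.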
rank0_ip_address ${gaia.head_ip_address} Internal use IP address of the head HTTP server, rank 0. Set to either a second internal network accessible by all ranks or to ${gaia.head_ip_address}.
trigger_port 9001 Trigger ZMQ publisher server port (-1 to disable), uses the 'head_ip_address' interface.
set_monitor_port 9002 Set monitor ZMQ publisher server port (-1 to disable), uses the 'head_ip_address' interface.
enable_odbc_connector false Enable ODBC connector
enable_caravel false Enable the Caravel runtime
enable_kibana_connector false Enable Kibana connector
global_manager_port_one 5552 Internal communication ports.
enable_worker_http_servers false Enable worker HTTP servers; each process runs its own server for direct ingest.

rank1.worker_http_server_port 9192
rank2.worker_http_server_port 9193
Optionally, specify the worker HTTP server ports. The default is to use (head_port + rank#) for each worker process, where the rank number is from 1 to 'number_of_ranks' set below.
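
For example, to enable direct ingest to the first two worker ranks on their default ports (the ports shown match the defaults above):

enable_worker_http_servers    = true
rank1.worker_http_server_port = 9192
rank2.worker_http_server_port = 9193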
hostsfile   Filename of an optional hosts file; if unspecified, all ranks will be created on the current machine. Note that 'slots' must equal 'max_slots' and the first address should be 127.0.0.1. This is the file that is passed to 'mpirun --hostfile <...>'. Sample hostname entry:

172.30.20.4 slots=3 max_slots=3 ...
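
A complete hosts file for a small cluster might look like the following (the worker addresses are illustrative; note that the first entry is 127.0.0.1 and 'slots' equals 'max_slots' on every line):

127.0.0.1    slots=3 max_slots=3
172.30.20.5  slots=3 max_slots=3
172.30.20.6  slots=3 max_slots=3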
mpi_options   These options will be passed on the command line to mpirun. For example, setting mpi_options = --mca btl_tcp_if_exclude 10.28.6.0/24 will prevent MPI from binding to TCP IP addresses in the subnet 10.28.6.0/24.
compress_network_data false Enables compression of inter-node network data transfers.
communicator_type MPI Communicator type (either MPI or ZMQ)

Security

Configuration parameter default value description
require_authentication false Require authentication.
enable_authorization false Enable authorization checks.
min_password_length 0 Minimum password length.
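
For example, a locked-down configuration might enable both checks and require longer passwords (the minimum length shown is illustrative):

require_authentication = true
enable_authorization   = true
min_password_length    = 8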

Licensing

Configuration parameter default value description
license_key 123 The license key to authorize running.

Process and thread configuration

Configuration parameter default value description
number_of_ranks 3 Set the number of ranks to create. The minimum number is 2, a single http head rank and one worker. This value is passed to 'mpirun -np <...>' to set the number of MPI ranks.
min_http_threads 2 Set min number of web server threads to spawn. (default: 2)
max_http_threads 512 Set max number of web server threads to spawn. (default: 512)
sm_omp_threads 4 Set the number of parallel jobs to create for multi-child set calculations. Use -1 to use the max number of threads (not recommended).
kernel_omp_threads 4 Set the number of parallel calculation threads to use for data processing. Use -1 to use the max number of threads (not recommended).
toms_per_rank 4 Set the number of TOMs per rank, the number of data container shards per rank.
tps_per_tom 4 Set the number of TaskProcessors per TOM, CPU data processors.
tcs_per_tom 4 Set the number of TaskCalculators per TOM, GPU data processors.
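
As a sketch, a single-node install with one head rank and two workers using the default thread and TOM counts from this section would look like:

number_of_ranks    = 3
min_http_threads   = 2
max_http_threads   = 512
sm_omp_threads     = 4
kernel_omp_threads = 4
toms_per_rank      = 4
tps_per_tom        = 4
tcs_per_tom        = 4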

Hardware configuration

Configuration parameter default value description
rank0.gpu 0 Specify the GPU to use for all calculations on the HTTP server node, rank 0. Note that the rank0 GPU may be shared with another rank.

rank1.taskcalc_gpu
rank2.taskcalc_gpu
Set GPU devices for each worker rank's TaskCalculators; see 'tcs_per_tom'. If no GPUs are specified, each rank's TaskCalculators will share the same GPU and each rank will round-robin the available GPUs on the system. Ideally, each rank should use a single specified GPU to allow data caching. Add rankN.taskcalc_gpu as needed, where N ranges from 1 to 'number_of_ranks'. In the example below, the successively created TaskCalculators use GPU devices 0, 1, and 2 respectively:

rank1.taskcalc_gpu = 0 1 2
rank0.numa_node   Set the head HTTP rank0 NUMA node(s). If left empty, there will be no thread affinity or preferred memory node. The node list may be either a single node number or a range, e.g. 1-5,7,10. If there will be many simultaneous users, specify as many nodes as possible that won't overlap the rank1+ worker NUMA nodes that the GPUs are on. If there will be few simultaneous users and WMS speed is important, choose the NUMA node the 'rank0.gpu' is on.

rank1.base_numa_node
rank2.base_numa_node
Set each worker rank's preferred base NUMA node(s) for CPU affinity and memory allocation. The 'rankN.base_numa_node' is the node or nodes that non-data-intensive threads will run in. These nodes do not have to be the same NUMA nodes that the GPU specified by the corresponding 'rankN.taskcalc_gpu' is on, though for best performance they should be relatively near their 'rankN.data_numa_node'. If not specified or left empty, there will be no CPU thread affinity or preferred node for memory allocation. The node list may be a single node number or a range, e.g. 1-5,7,10.

rank1.data_numa_node
rank2.data_numa_node
Set each worker rank's preferred data NUMA node(s) for CPU affinity and memory allocation. The 'rankN.data_numa_node' is the node or nodes that data-intensive threads will run in and should be set to the same NUMA node that the GPU specified by the corresponding 'rankN.taskcalc_gpu' is on for best performance. If 'rankN.taskcalc_gpu' is specified, 'rankN.data_numa_node' will automatically be set to the node the GPU is attached to; otherwise, if not specified or left empty, there will be no CPU thread affinity or preferred node for memory allocation. The node list may be a single node number or a range, e.g. 1-5,7,10.
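
For example, on a hypothetical two-socket machine where rank 1's GPU is attached to NUMA node 1, the affinity settings might look like the following (the node numbers are illustrative):

rank0.numa_node      = 0
rank1.base_numa_node = 0
rank1.data_numa_node = 1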

General configuration

Configuration parameter default value description
protected_sets MASTER,_MASTER,_DATASOURCE Tables with these names will not be deleted (comma separated).
default_ttl 20 Time-to-live in minutes of non-protected tables before they are deleted.
memory_ttl 9999999 Time in minutes for unused data tables to stay in memory.
disable_clear_all false Disallow the "clear" command to clear all tables.
point_render_threshold 100000 Threshold number of points (per-TOM) at which point rendering switches to fast mode.
symbology_render_threshold 10000 Threshold for the number of points (per-TOM) after which symbology rendering falls back to regular rendering.
max_heatmap_size 3072 Maximum size (in pixels) of a heatmap that can be generated. This reserves MHS*MHS*8 bytes of GPU memory on rank 0.
enable_pinned_memory false Enable (if true) or disable pinned memory; disabling pinned memory uses malloc and will result in slower transfer times to the GPU, but allows more system memory to be used.
pinned_memory_pool_size 0 Size in bytes of the pinned memory pool (per-rank) (set to 0 to disable) (enable_pinned_memory must be false if using)
concurrent_kernel_execution true Enable (if true) multiple kernels to run concurrently on the same GPU
enable_gpu_allocator true Enable (if true) the GPU memory pre-allocator
max_gpu_memory -1 Maximum amount of GPU memory to allocate (in bytes) (only works if enable_gpu_allocator is true) (set to -1 to allocate the maximum)
enable_gpu_caching false Enable (if true) caching data on the GPU (requires enable_gpu_allocator to be true)
fast_polling false Enable (if true) fast polling on the worker ranks (at the cost of 100% CPU usage).
stripmine_num_points 0 Chunk size used by kernels that do stripmine processing of the attribute vectors. If set to 0, the chunk size is set to the vector size, effectively disabling stripmining.
max_get_records_size 20000 Maximum number of records that data retrieval requests (like /get/records, /aggregate/groupby, etc.) will return at one time
request_timeout 20 Timeout (in minutes) for filter-type requests
max_query_temp_memory 64000000000 Maximum amount of temporary memory (in bytes) a query can allocate (per-tom); if more is needed the query will fail
on_startup_script   An optional executable command that will be run once when Kinetica is ready for client requests. This can be used to perform any initialization logic that needs to be run before clients connect. It will be run as the gpudb user, so you must ensure that any required permissions are set on the file to allow it to be executed. If the command cannot be executed or returns an error code (!=0), then Kinetica will be stopped. Output from the startup script will be logged to /opt/gpudb/core/logs/gpudb-on-start.log (and its dated relatives). The gpudb_env.sh script is run directly before the command, so the path will be set to include the supplied python runtime. Example: on_startup_script = /home/gpudb/on-start.sh param1 param2 ...
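
As an illustration of the GPU memory parameters above, a configuration that pre-allocates GPU memory, caps it, and enables caching on the GPU might look like the following (the byte count is illustrative):

enable_gpu_allocator = true
max_gpu_memory       = 8000000000
enable_gpu_caching   = true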

Tomcat Configuration

Configuration parameter default value description
enable_tomcat true  

Text search configuration

Configuration parameter default value description
enable_text_search true Enable text search.
use_external_text_server false Use an external text server instead of an internal one. Be sure to update the rankN.text_index_address and rankN.text_search_address params below.
text_indices_per_rank 10 Number of text indices to start for each rank
text_searcher_refresh_interval 20 Searcher refresh interval - specifies the maximum delay (in seconds) between writing to the lucene index and being able to search for the value just written. A value of 0 ensures that writes to the index are immediately available to be searched. A more nominal value of 100 should improve ingest speed at the cost of some delay in being able to text search newly added values.

rank1.text_index_address ipc:///${gaia.temp_directory}/gpudb-text-index-1
rank2.text_index_address ipc:///${gaia.temp_directory}/gpudb-text-index-2
External text server addresses to use if 'use_external_text_server = true'. Specify one for each worker rank N, where N ranges from 1 to 'number_of_ranks'; add the appropriate number of rankN.text_index_address entries as needed. The addresses can be a fully qualified TCP address:port for remote servers or an IPC address for local text index servers. If no addresses are specified, the text index servers will use IPC and be started on the machine where the rank is running, as shown in the IPC example below. You should either specify all addresses or none to get the defaults. Example for remote or TCP servers:

rank1.text_index_address  = tcp://127.0.0.1:4000
rank2.text_index_address  = tcp://127.0.0.1:4001
... up to rank[number_of_ranks].text_index_address = ...

Example for local IPC servers:

rank1.text_index_address  = ipc:///tmp/gpudb-text-index-1
rank2.text_index_address  = ipc:///tmp/gpudb-text-index-2
... up to rank[number_of_ranks].text_index_address = ...

where '/tmp/gpudb-text-index-1' is the name of the socket file to create.

Data persistence storage configuration and directories

Configuration parameter default value description
persist_directory /opt/gpudb/persist Specify a base directory to store persistence data files.
data_directory ${gaia.persist_directory} Base directory to store data vectors.
object_directory ${gaia.persist_directory} Base directory to store added objects.
sms_directory ${gaia.persist_directory} Base directory to store hashed strings.
text_index_directory ${gaia.persist_directory} Base directory to store the text search index.
temp_directory /tmp Directory for Kinetica to use to store temporary files. Must be a fully qualified path with at least 100 MB of free space and execute permission.
persist_encryption_pass_command   Path to a script that echoes the password for mounting encrypted persistence directories. Note that all directories must be encrypted with the same password. Leave blank if the persistence directories are not encrypted or if the directories will be mounted before Kinetica starts.
persist_sync false Synchronous persistence file writing instead of asynchronous writing.
persist_sync_time 5 Force syncing the persistence files every N minutes if out of sync. Note that files are always opportunistically saved; this simply enforces a maximum time a file can be out of date. Set to a very high number to disable.
synchronous_update true If true, updates are applied to IndexDB synchronously (slower); if false, updates are asynchronous (faster but less safe).
load_vectors_on_start always Controls when data vectors are loaded at startup; valid values are always, lazy, and on_demand.
indexdb_toc_size 1000000 Table of contents size for IndexedDb object file store
indexdb_max_open_files 128 Maximum number of open files for IndexedDb object file store
sms_max_open_files 128 Maximum number of open files (per-TOM) for the SMS (string) store
chunk_size 8000000 Chunk size (0 disables chunking)
max_join_query_dimensions 2 The maximum number of tables in a joined table that can be accessed by a query without being equated by a foreign-key to primary-key equality predicate.
chunk_delete_enabled true Delete memory and disk storage for chunks whose size goes to 0
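
For example, to store all persistence data on a dedicated volume while keeping asynchronous writes but forcing a sync at least every five minutes (the mount point is illustrative; the sync settings match the defaults above):

persist_directory = /data/gpudb/persist
persist_sync      = false
persist_sync_time = 5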

Stats logging configuration

Configuration parameter default value description
enable_stats_server true Run a stats server to collect information about Kinetica and servers it is running on.

stats_server_ip_address ${gaia.rank0_ip_address} Stats server IP address (run on the head node).
stats_server_port 2003 Stats server port; the default port is 2003.
stats_server_namespace gpudb Stats server namespace - should be a machine identifier.
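
For example, to collect stats on the head node under a host-specific namespace (the namespace value is illustrative; the address and port match the defaults above):

enable_stats_server     = true
stats_server_ip_address = ${gaia.rank0_ip_address}
stats_server_port       = 2003
stats_server_namespace  = gpudb-host01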

Procs

Configuration parameter default value description
enable_procs true Enable procs.
proc_directory ${gaia.temp_directory} Directory where proc files are stored at runtime. Must be a fully qualified path with execute permission. If not specified, temp_directory will be used.
proc_data_directory ${gaia.temp_directory} Directory where data transferred to and from procs is written. Must be a fully qualified path with sufficient free space for required volume of data. If not specified, temp_directory will be used.
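
For example, to run procs out of a dedicated scratch area instead of temp_directory (the paths are illustrative):

enable_procs        = true
proc_directory      = /opt/gpudb/procs
proc_data_directory = /opt/gpudb/procs/data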

HA

Configuration parameter default value description
enable_ha false Enable HA. At the moment, this section simply provides information about whether and where HA is running, so that the gpudb startup script can properly create proxy-pass entries for httpd when both HA and httpd are enabled.

ha_proc_ip_address 127.0.0.1
ha_proc_port 9191
The IP address and port of the machine running HA that are accessible from httpd; used to generate proxy-pass entries. The IP address defaults to 127.0.0.1 because, in most cases, HA-Proc will be running on the same machine as rank 0.
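
For example, when HA-Proc runs on the same machine as rank 0 (the default assumption noted above):

enable_ha          = true
ha_proc_ip_address = 127.0.0.1
ha_proc_port       = 9191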