Overview
Within Resource Management, there are three main areas of configuration that take place while the database is offline:
- Tiered Storage
- Default Resource Group
- Tier Strategy
For configuration that can be performed while the database is online, see Resource Management Usage.
Tiered Storage
All tiers must be defined while the database is offline, in /opt/gpudb/core/etc/gpudb.conf, though the capacity and high & low watermarks within each tier can be altered while the database is operational.
The VRAM Tier is the exception: it pre-allocates memory upon system startup, so its capacity cannot be altered while the database is running.
| Tier | Total Instances | Description |
|---|---|---|
| VRAM | 1 | Video RAM; present in all CUDA-based installations |
| RAM | 1 | Main memory; present in all installations |
| Disk | 0..N | Optional disk-based cache; required for caching non-persisted database objects |
| Persist | 1 | Primary disk-based data store; present in all installations |
| Cold Storage | 0..N | Optional secondary disk-based data store |
Global Tier Parameters
The only parameter that spans all tiers is tier.global.concurrent_wait_timeout. This is the time, in seconds, that a request will wait for a dependent resource that is already being loaded to fulfill a separate request. This timeout should be adjusted to cover the expected wait time for the slowest data source in the system. If the slowest tier is the Persist Tier, a reasonable timeout will likely be much smaller than if the slowest tier is a Cold Storage Tier with longer transfer times.
Tier-Specific Parameters
For tier-specific parameters, the general format for defining the tier combines a tier type, a config level, an optional config sublevel, and the parameter name.
The tier type is one of the following:
- vram - VRAM Tier (GPU memory)
- ram - RAM Tier (Main memory)
- disk - Disk Cache Tier (Disk cache)
- persist - Persist Tier (Permanent storage)
- cold - Cold Storage Tier (Extended long-term storage)
The config_level is the level at which the parameter is applied. A config_level of default applies the parameter’s value to all ranks, system wide, for the respective tier. To override that default value for a specific rank, the config_level should be rank followed by the index of the rank to modify.
The only tier with a config_sublevel is the VRAM Tier, which uses all_gpus. This simply applies the default values and rank overrides to all GPU devices and acts as a placeholder for future use.
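As a rough illustration of how these pieces combine in /opt/gpudb/core/etc/gpudb.conf, the sketch below assumes a dot-separated form, by analogy with tier.global.concurrent_wait_timeout. The exact layout should be confirmed against the Configuration Reference. The byte values and the rank index are illustrative only.

```ini
# Assumed general form (not confirmed by this page):
#   tier.<tier_type>.<config_level>.[<config_sublevel>.]<parameter> = <value>

# Default for all ranks: cap the RAM Tier (illustrative value, in bytes)
tier.ram.default.limit = 64000000000

# Override for rank 1 only (illustrative rank index and value)
tier.ram.rank1.limit = 32000000000

# The VRAM Tier is the only tier that takes a config sublevel (all_gpus)
tier.vram.default.all_gpus.limit = -1
```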
While the complete set of valid parameter names varies between
tiers, most (except the Cold Storage Tier) will have the following common
ones:
- limit - the maximum capacity of the tier
- high_watermark - the high watermark, as a percentage, to use for watermark-based eviction
- low_watermark - the low watermark, as a percentage, to use for watermark-based eviction
To disable watermark-based eviction, set high_watermark & low_watermark to 100. This can be done for an entire tier, using default, or for individual ranks within a tier by specifying the rank index (rank[#]). Setting a tier’s size limit to -1 will also implicitly disable watermark-based eviction, except in the case of the VRAM Tier, where it will reserve 95% of video RAM and use the watermark settings for eviction.
VRAM Tier
The default configuration assigns the VRAM Tier 95% of VRAM and high & low watermarks of 90% & 50%, respectively.
A limit of -1 reserves 95% of available video RAM for usage by the VRAM Tier. To set a specific amount of VRAM, change this setting to the number of bytes of VRAM to use. The high_watermark & low_watermark values are percentages of the limit.
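The default described above might look like the following in gpudb.conf. This is a sketch that assumes the dot-separated tier.vram.default.all_gpus.<parameter> naming. The values come from the text: a limit of -1 and watermarks of 90% and 50%.

```ini
# VRAM Tier defaults for all ranks and all GPUs (assumed key layout)
# -1 reserves 95% of available video RAM
tier.vram.default.all_gpus.limit = -1
# Watermarks are percentages of the limit
tier.vram.default.all_gpus.high_watermark = 90
tier.vram.default.all_gpus.low_watermark = 50
```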
RAM Tier
The default configuration assigns the RAM Tier no size limit and disables watermark-based eviction.
It is recommended to give the RAM Tier a size limit, as this allows other processes on the host to use the remainder of the RAM and also enables database object eviction to the Disk Cache Tier and Persist Tier.
A limit of -1 assigns no RAM size limit for usage by the RAM Tier. To set a specific amount of RAM, change this setting to the number of bytes of RAM to use. The high_watermark & low_watermark values are percentages of the limit.
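A minimal sketch of a RAM Tier with a size limit follows, assuming the tier.ram.default.<parameter> key layout. The 64 GB limit and the 90% & 50% watermarks are illustrative values, not recommendations from this page.

```ini
# Cap the RAM Tier on every rank (illustrative: 64 GB, expressed in bytes)
tier.ram.default.limit = 64000000000
# With a limit in place, watermark-based eviction can take effect
# (illustrative percentages of the limit)
tier.ram.default.high_watermark = 90
tier.ram.default.low_watermark = 50
```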
Disk Cache Tier
The default configuration defines no Disk Cache Tier instances. To define a Disk Cache Tier, first ensure that the RAM Tier limit is set to a positive number of bytes of RAM to use. Without this, no eviction from the RAM Tier will take place, and the Disk Cache Tier will be unused. Next, define a Disk Cache Tier instance. The instance should be identified by the keyword disk, followed by a digit, 0 - 9. The path parameter needs to be set to a directory/mount point that exists on all Kinetica hosts and has read/write access for the gpudb user. A limit of -1 assigns no disk size limit for usage by the Disk Cache Tier. To set a specific amount of disk space, change this setting to the number of bytes of disk to use. The high_watermark & low_watermark values are percentages of the limit.
Lastly, the Disk Cache Tier can be configured to store persistent objects. If the disk backing the Disk Cache Tier is more performant than the disk backing the Persist Tier, set store_persistent_objects to true to cache persistent objects in the Disk Cache Tier. Otherwise (especially if the Disk Cache Tier & Persist Tier are backed by the same disk), set store_persistent_objects to false to bypass the Disk Cache Tier when writing persistent objects to disk.
For instance, to create a 500GB Disk Cache Tier instance with ID disk1,
using a mount point of /opt/gpudb/diskcache,
watermark-based eviction thresholds of 90% & 80%, and caching of persistent
objects:
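The following sketch shows what such an instance could look like. It assumes the tier.disk1.default.<parameter> key layout and treats 500GB as 500,000,000,000 bytes (decimal gigabytes); adjust the value if binary units are intended.

```ini
# Disk Cache Tier instance "disk1" (assumed key layout)
tier.disk1.default.path = /opt/gpudb/diskcache
# 500 GB expressed in bytes (decimal GB assumed)
tier.disk1.default.limit = 500000000000
# Watermark-based eviction thresholds, as percentages of the limit
tier.disk1.default.high_watermark = 90
tier.disk1.default.low_watermark = 80
# Cache persistent objects in this tier as well
tier.disk1.default.store_persistent_objects = true
```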
It is recommended to give each Disk Cache Tier instance a size
limit, as this allows other processes on the host to use the remainder of the
disk capacity and also enables database object eviction to the
Persist Tier.
Persist Tier
The default configuration assigns the Persist Tier no size limit and disables watermark-based eviction. It also saves database column & object data to the default Kinetica persistence directory, /opt/gpudb/persist.
The Persist Tier should be given a size limit if any Cold Storage Tier instances are to be used to back it, as database object eviction from the Persist Tier to cold storage can only take place if the Persist Tier has a defined limit.
A limit of -1 assigns no disk size limit for usage by the Persist Tier. To set a specific amount of disk space, change this setting to the number of bytes of disk to use. The high_watermark & low_watermark values are percentages of the limit.
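A sketch of a Persist Tier with a limit, so that eviction to cold storage becomes possible, is shown here. It assumes the tier.persist.default.<parameter> key layout. The 1 TB limit and the watermark percentages are illustrative values only.

```ini
# Give the Persist Tier a defined limit (illustrative: 1 TB, in bytes)
tier.persist.default.limit = 1000000000000
# Illustrative eviction thresholds, as percentages of the limit
tier.persist.default.high_watermark = 90
tier.persist.default.low_watermark = 80
```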
Cold Storage Tier
The default configuration defines no Cold Storage Tier instances. To define a Cold Storage Tier, first ensure that the Persist Tier limit is set to a positive number of bytes of disk space to use. Without this, no eviction from the Persist Tier will take place, and the Cold Storage Tier will be unused. Next, define the Cold Storage Tier instance. The instance should be identified by the keyword cold, followed by a digit, 0 - 9. The type parameter should be set to one of the following, with a corresponding base_path:
- disk (local/network storage)
- hdfs (Hadoop File System)
- azure (Azure blob storage)
- s3 (Amazon S3)
For instance, to create a disk-based Cold Storage Tier instance with ID cold1, using a mount point of /opt/gpudb/cold:
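A sketch of the corresponding entries, assuming the tier.cold1.default.<parameter> key layout and using the type and base_path parameters named above:

```ini
# Disk-based Cold Storage Tier instance "cold1" (assumed key layout)
tier.cold1.default.type = disk
tier.cold1.default.base_path = /opt/gpudb/cold
```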
To create an HDFS-based Cold Storage Tier instance with ID cold2, with a URI of hdfs://localhost:9000, an HDFS user named kinetica, a root HDFS path of /gpudb/cold, and using Kerberos with a keytab file path of /opt/gpudb/krb5.keytab:
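A sketch of what this could look like follows. The hdfs_* parameter names below are assumptions, since this page does not list them, and should be checked against the Configuration Reference before use. Only type and base_path are named in the text.

```ini
# HDFS-based Cold Storage Tier instance "cold2" (assumed key layout)
tier.cold2.default.type = hdfs
tier.cold2.default.base_path = /gpudb/cold
# The parameter names below are assumed, not confirmed by this page
tier.cold2.default.hdfs_uri = hdfs://localhost:9000
tier.cold2.default.hdfs_principal = kinetica
tier.cold2.default.hdfs_use_kerberos = true
tier.cold2.default.hdfs_kerberos_keytab = /opt/gpudb/krb5.keytab
```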
To create an Azure-based Cold Storage Tier instance with ID cold3, having a container name of kinetica, a storage account name & key of name & key, respectively, and a root path of /gpudb/cold:
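A sketch under the same assumptions follows. The azure_* parameter names are assumed rather than taken from this page, so verify them in the Configuration Reference.

```ini
# Azure-based Cold Storage Tier instance "cold3" (assumed key layout)
tier.cold3.default.type = azure
tier.cold3.default.base_path = /gpudb/cold
# The parameter names below are assumed, not confirmed by this page
tier.cold3.default.azure_container_name = kinetica
tier.cold3.default.azure_storage_account_name = name
tier.cold3.default.azure_storage_account_key = key
```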
To create an S3-based Cold Storage Tier instance with ID cold4, having a bucket name of kinetica, an S3 Access Key ID & Secret Access Key of key & secret, respectively, and a root S3 path of /gpudb/cold:
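A sketch of the corresponding entries follows. The s3_aws_access_key_id and s3_aws_secret_access_key names appear in this document. The s3_bucket_name parameter and the tier.cold4.default.<parameter> layout are assumptions to confirm in the Configuration Reference.

```ini
# S3-based Cold Storage Tier instance "cold4" (assumed key layout)
tier.cold4.default.type = s3
tier.cold4.default.base_path = /gpudb/cold
# Bucket parameter name is assumed, not confirmed by this page
tier.cold4.default.s3_bucket_name = kinetica
# Credential parameter names as given in this document
tier.cold4.default.s3_aws_access_key_id = key
tier.cold4.default.s3_aws_secret_access_key = secret
```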
If the s3_aws_access_key_id or s3_aws_secret_access_key parameter values are not supplied via the /opt/gpudb/core/etc/gpudb.conf file, they should instead be provided via the AWS CLI or via the respective AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY environment variables.
Default Resource Group
The default resource group is applied to all users who do not have one set explicitly. The default parameters can be modified in /opt/gpudb/core/etc/gpudb.conf and are described in the Configuration Reference under Default Resource Group.
Tier Strategy
There are two configuration parameters in /opt/gpudb/core/etc/gpudb.conf that relate to tier strategy, both described in the Configuration Reference under Tier Strategy:
- Default Tier Strategy
- Tier Strategy Predicate Evaluation Interval
Default Tier Strategy
The tier_strategy.default parameter is the default tier strategy, applied to all tables and columns that have not had a tier strategy set explicitly, either during table creation or by subsequent alteration.
A default tier strategy can include a single Disk Cache Tier instance to enable the caching to disk of all database objects, including non-persistent ones like filtered views, join views, and memory-only tables, as well as those that are created temporarily to service user requests. Without a Disk Cache Tier in the default tier strategy, these non-persistent objects cannot be moved out of RAM to make room for other data that may be required to process higher-priority requests.
The strategy can also include a Cold Storage Tier instance to allow database objects to be moved off of local disks, making room for other data. Without a Cold Storage Tier in the default tier strategy, each user-created persisted table will have to be designated, during creation or modification, as able to be moved off of local disks.
An example default tier strategy configuration is:
Tier Strategy Predicate Evaluation Interval
Predicates can be used within a tier strategy to distribute & prioritize table and column data among the defined tiers. Those predicates are reevaluated at an interval to determine whether it is necessary to move data to a different tier. The two cases where reevaluation will have an impact are:
- The predicate contains a function that is not data-based, like NOW(), which may be used to keep current data within a given tier. For instance, to keep in RAM all records which have been modified in the last hour, the RAM Tier could have a predicate that compares each record's modification time against NOW() less one hour.
- The predicate references a column whose values have been updated since the last evaluation interval