Cluster Resiliency Configuration & Management

Cluster resiliency is enabled through the use of an active or spare node that has been marked as accepting failover processes. As noted in Cluster Resiliency Overview, spare nodes are typically configured during installation but can also be added later using the Cluster Management Interface. Active nodes in the cluster that do not currently accept failover processes can be updated to accept them using KAgent as well.

Tip

If you installed your Kinetica 7.1 cluster within a Microsoft Azure environment, KAgent will automatically handle volume provisioning and management.

Configuration

All general settings for the failover/switchover protocol can be edited using KAgent. To access these settings:

  1. Navigate to KAgent (http://<kagent-host>:8081).
  2. Click Manage.
  3. If there are multiple rings, click Clusters next to the desired ring. Otherwise, skip to the next step.
  4. Click Manage next to the desired cluster.
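
Before opening KAgent in a browser, it can be useful to confirm that the host is reachable. The following is a minimal sketch, assuming KAgent listens on port 8081 as in the URL from step 1. The helper names kagent_url and is_reachable are hypothetical and are not part of any Kinetica tooling; the check only tests that a TCP connection can be opened, not that KAgent itself is healthy.

```python
import socket

KAGENT_PORT = 8081  # port taken from the URL shown in step 1


def kagent_url(host: str, port: int = KAGENT_PORT) -> str:
    """Build the KAgent URL for a host (hypothetical helper)."""
    return f"http://{host}:{port}"


def is_reachable(host: str, port: int = KAGENT_PORT, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

For example, is_reachable("<kagent-host>") returning False suggests a network or firewall problem to resolve before following the steps above.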

Failover Policy

Failover policy flags determine whether the head node and/or worker nodes fail over to another node when a node failure occurs. These policy settings are available in KAgent via the Failover Configuration tab in the Nodes management interface. The policies and their settings are as follows:

  • Enable Head Failover
    • Disabled (X) -- The head node will not fail over if any kind of process failure occurs.
    • Enabled -- The head node will fail over. See Cluster Resiliency Overview for details.
  • Enable Worker Failover
    • Disabled (X) -- The worker node will not fail over if any kind of process failure occurs.
    • Enabled -- The worker node will fail over. See Cluster Resiliency Overview for details.
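
The effect of the two flags can be pictured with a small model. This is an illustrative sketch only: the FailoverPolicy class and its field names are hypothetical and do not correspond to a Kinetica API or configuration file; it simply encodes the rule that each flag governs failover for its own node role.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FailoverPolicy:
    """Illustrative model of the two cluster-level flags (not a Kinetica API)."""

    enable_head_failover: bool
    enable_worker_failover: bool

    def should_fail_over(self, node_role: str) -> bool:
        """Return True if a failed node of the given role may fail over."""
        if node_role == "head":
            return self.enable_head_failover
        if node_role == "worker":
            return self.enable_worker_failover
        raise ValueError(f"unknown node role: {node_role!r}")


# Head failover disabled, worker failover enabled.
policy = FailoverPolicy(enable_head_failover=False, enable_worker_failover=True)
print(policy.should_fail_over("head"))    # False
print(policy.should_fail_over("worker"))  # True
```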

To update these settings:

  1. From the Nodes tab in the Manage interface for a cluster, click Failover Configuration.
  2. In the top section, Cluster Failover, adjust the sliders as necessary.
  3. Click Update.
../cluster_failover_settings.png

Data Loading Policy

The data loading policies determine when data is loaded and when primary keys and indexes are built on database startup and/or after a failover has occurred. These policies affect how long it takes for a failed-over process to become operational on the new node. There are three options available for each policy:

  • Always -- Load as much of the stored data as possible into memory before accepting requests
  • Lazy Load -- Load the necessary data to start, and load as much of the remainder of the stored data as possible into memory lazily
  • On-demand Load -- Only load data as requests use it
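
The trade-off between the three options can be sketched as a question of what must be in memory before requests are accepted. The code below is a simplified illustration, not Kinetica's implementation: the LoadPolicy enum, the load_before_serving function, and the table names are all hypothetical, and memory limits ("as much as possible") are ignored.

```python
from enum import Enum


class LoadPolicy(Enum):
    ALWAYS = "Always"
    LAZY = "Lazy Load"
    ON_DEMAND = "On-demand Load"


def load_before_serving(policy: LoadPolicy, required_at_start: dict[str, bool]) -> list[str]:
    """Return the objects loaded before requests are accepted.

    required_at_start maps an object name to True if it is needed to start.
    """
    if policy is LoadPolicy.ALWAYS:
        # Everything is loaded up front: slowest start, no later load cost.
        return sorted(required_at_start)
    if policy is LoadPolicy.LAZY:
        # Only what is needed to start; the remainder loads lazily afterwards.
        return sorted(name for name, needed in required_at_start.items() if needed)
    # ON_DEMAND: nothing is loaded until a request uses it.
    return []


tables = {"orders": True, "archive": False}  # hypothetical objects
print(load_before_serving(LoadPolicy.ALWAYS, tables))     # ['archive', 'orders']
print(load_before_serving(LoadPolicy.LAZY, tables))       # ['orders']
print(load_before_serving(LoadPolicy.ON_DEMAND, tables))  # []
```

The fewer objects that must load before serving, the sooner a failed-over process is operational, at the cost of slower first access to data that has not been loaded yet.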

To update these settings:

  1. From the Nodes tab in the Manage interface for a cluster, click Failover Configuration.
  2. In the bottom section, Data Loading Policy, adjust the four policies as necessary.
  3. Click Update.
../data_loading_policy.png

Management

Spare nodes and existing hosts are configured for cluster resiliency using KAgent. Switchover can also be triggered from KAgent.

Spare Nodes

Spare nodes are used in the event of a node failover (see Cluster Resiliency Overview for details). Spare nodes can be added to a cluster during the installation process; spare nodes added this way accept failover processes by default. Review the cluster resiliency considerations before adding a spare node, and note that these considerations vary slightly for Azure setups.

To add spare nodes or additional workers using KAgent after installation:

  1. From the Nodes tab in the Manage interface for a cluster, click List.
  2. Click Add Spare.
  3. Click Add New Node once for each node to be added to the cluster.
  4. Provide a custom Label (hostname is suggested), Internal IP address, and Public IP address for each node.
  5. Click Add. Confirm the addition.

The spare node is added to the cluster but does not accept failover by default. See Existing Host Management for instructions on changing this behavior.

../add_spare_worker.png

Existing Host Management

If a host was not initially marked as accepting failover, or previously failed over and has since been repaired, the host can be marked to accept failover processes via KAgent. To change failover acceptance for nodes in a given cluster:

  1. From the Nodes tab in the Manage interface for a cluster, click Failover Configuration.
  2. In the middle section, Node Accept Failover, adjust the sliders as necessary for each node in the cluster.
  3. Click Update.
../node_accept_failover.png
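
The role of the per-node setting can be illustrated with a short sketch: only nodes marked as accepting failover are candidates to receive processes from a failed node. The Node class and eligible_targets function are hypothetical names used for illustration; how Kinetica actually chooses among candidates is not modeled here.

```python
from dataclasses import dataclass


@dataclass
class Node:
    label: str
    accepts_failover: bool  # mirrors the Node Accept Failover slider


def eligible_targets(nodes: list[Node], failed_label: str) -> list[str]:
    """Return labels of nodes that could receive processes from the failed node."""
    return [
        n.label
        for n in nodes
        if n.accepts_failover and n.label != failed_label
    ]


cluster = [
    Node("head", accepts_failover=False),
    Node("worker1", accepts_failover=True),
    Node("spare1", accepts_failover=False),  # added via KAgent: off by default
]
print(eligible_targets(cluster, failed_label="worker1"))  # []

cluster[2].accepts_failover = True  # after enabling the slider for spare1
print(eligible_targets(cluster, failed_label="worker1"))  # ['spare1']
```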

Triggering Switchover

To trigger a switchover from one node to another via KAgent:

  1. From the Nodes tab in the Manage interface for a cluster, click Switchover.
  2. Select a node in the cluster to be switched from in the Source Node drop-down menu.
  3. Select a node in the cluster to be switched to in the Target Node drop-down menu.