The Kinetica HA solution eliminates the single point of failure in a standard Kinetica cluster and provides recovery from failures. It is implemented externally to Kinetica, maintains multiple replicas of the data, and provides an eventually consistent data store.
Figure 1
The Kinetica HA solution is composed of three parts: a Kinetica front-end load balancer, a Distributed Messaging Queue (DMQ), and the Kinetica HA components. The front-end load balancer (GPUdbLB) consists of multiple Kinetica web servers, each of which is either multihomed or connected to a multihomed switch (this assumes multiple redundant Internet connections to multiple switches). This prevents any network failure outside the Kinetica HA solution from disrupting a user's access to Kinetica's services. Given this redundant connectivity, a high-level diagram of the Kinetica HA solution is shown in Figure 1.
From Figure 1, we can see that multiple Kinetica instances run side by side and that there is a 1-to-1 relationship between HAProc (the main HA processing component) and Kinetica. This reduces the complexity of the HA solution and allows each HAProc to run on the same hardware as its Kinetica. It also lets us assume that if an HAProc is down, its associated Kinetica is down; this assumption simplifies transaction processing and speeds up the HA solution. The diagram also shows the Distributed Messaging Queue (DMQ) as a separate component in the HA solution. While this is logically the case, the DMQ processes can run on the same hardware as the HAProc and Kinetica servers, reducing the total amount of hardware needed to run the HA solution. Later in this document we describe in more detail how the HA solution processes each request.
Like any software component, an HAProc can fail due to hardware or software problems. When an HAProc goes down, its Kinetica processes may be either up or down, depending on the kind of failure. In either case, since all access to Kinetica is through the HAProc (see Figure 1), once the HAProc goes down, its Kinetica is unavailable. During this time, any add/update requests coming from clients fail over to the other HAProcs and are serviced by their Kineticas. Each live HAProc, in addition to writing the client request data to its own Kinetica, also writes a transaction log to the HA message queue (DMQ). When the crashed HAProc comes back online, before it starts servicing requests, it fetches from the message queue all the transactions stored during its downtime and replays them on its own Kinetica. Query transactions do not need to be written to the DMQ. For all transaction types, once the load balancer notices that an HAProc is down, requests are seamlessly failed over to the HAProcs that remain alive.

If an HAProc ever crashes while its Kinetica is still up and running, an administrator should restart the HAProc process using the standard start script. HAProcs do not store any data locally; they always write to and read from a highly available message queue, so there is no need to back up or restore any files on behalf of an HAProc component. To bring an HAProc down for a planned maintenance activity, first gracefully shut down its companion Kinetica and then simply 'kill -9' the HAProc process.

Kinetica Failure and Recovery
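The write-and-replay behavior described above can be sketched as follows. This is a minimal illustration, not Kinetica's actual implementation: the class and field names are hypothetical, and a plain Python list stands in for the durable DMQ.

```python
class HAProc:
    """Illustrative sketch of an HA processing component (names are
    hypothetical, not part of Kinetica's real API)."""

    def __init__(self, name, dmq):
        self.name = name
        self.dmq = dmq        # shared transaction log (stands in for the DMQ)
        self.store = []       # stands in for this node's Kinetica data
        self.applied = 0      # offset of the last transaction applied locally

    def write(self, record):
        """Apply a write locally and log it for peers that are down."""
        self.store.append(record)
        self.dmq.append(record)
        self.applied = len(self.dmq)

    def recover(self):
        """On restart, replay every transaction logged during downtime
        before accepting new requests (this is the eventual consistency)."""
        while self.applied < len(self.dmq):
            self.store.append(self.dmq[self.applied])
            self.applied += 1

dmq = []                                  # shared, durable queue in the real system
a, b = HAProc("A", dmq), HAProc("B", dmq)
# A crashes; B keeps servicing writes and logging them to the DMQ.
b.write({"op": "insert", "row": 1})
b.write({"op": "insert", "row": 2})
a.recover()                               # A replays the missed transactions
print(a.store == b.store)                 # True: the replicas converge
```

Note that queries never enter the queue, so the log only grows with mutating operations.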
Kinetica can become unavailable in a few different ways: its companion HAProc going down (planned or crashed, since all access to Kinetica is through the HAProc), Kinetica being brought down for maintenance, or a Kinetica failure. Once Kinetica is down, its companion HAProc notices (due to a failure to communicate with Kinetica) and, in severe cases (for example, receiving a connection refused error), brings itself down. This way, the load balancer does not try to communicate with an HAProc that has lost its Kinetica.

Kinetica stores data in memory while keeping a copy of its data on disk. This allows Kinetica to be brought back up quickly with all of its data populated after it goes down. During a planned maintenance phase (when no data ingestion is expected), Kinetica can be brought down and the persistence directories backed up following the standard in-house procedure for file backup. In a standard installation, the persistence directory is located at /opt/gpudb/persist, but this can be changed in the Kinetica configuration files. The entire directory specified as the location of the persistent data should be backed up.

In summary, the Kinetica HA solution consists of two or more Kinetica HA components, an HA message queue for each HA component, and a cluster of load balancers. Each Kinetica HA component consists of an HAProc and its companion Kinetica, which work in tandem to service client requests. Any piece of an HA component going down (planned or unplanned) takes the whole HA component down and makes it unavailable. During this time, the other HA components continue to actively service client requests, and the failover from the failed HA component to the working ones is transparent. Once the failed HA component is brought back up, its HAProc transparently replays the transaction logs stored in the message queue to make itself consistent (hence the term eventual consistency) and starts serving clients again.
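The failure-detection step above amounts to the HAProc probing its companion Kinetica and taking itself out of rotation when the probe fails. A minimal sketch, assuming a plain TCP reachability check (the port and the shutdown action are illustrative, not taken from Kinetica's configuration):

```python
import socket

def kinetica_reachable(host, port, timeout=1.0):
    """Probe the companion Kinetica; a refused or timed-out connection
    means it is down (illustrative check, not Kinetica's real probe)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:          # covers ConnectionRefusedError and timeouts
        return False

# If the probe fails, the HAProc stops advertising itself so the load
# balancer routes around the pair (the port number here is hypothetical).
if not kinetica_reachable("localhost", 9191):
    print("companion Kinetica unreachable; taking this HAProc out of rotation")
```

Bringing the HAProc down on "connection refused" (rather than retrying indefinitely) is what keeps the load balancer from sending requests to a pair that can no longer serve them.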
A client starts out by sending a request to the GPUdb load balancer, where the load-balancing algorithm sends the request to a chosen HAProc (L). Depending on the request, the HAProc will take one of three actions:
While all of this is going on, consumers within each HAProc consume transactions from that HAProc's async queue. This is the source of the eventual consistency: the client receives a success once at least one synchronous or asynchronous call completes, while the rest of the HAProcs commit those transactions to their Kinetica servers in the background.
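The consistency model above can be sketched as follows. This is an illustrative simulation under stated assumptions, not Kinetica's implementation: one replica commits synchronously before the client sees success, and the other replicas' per-HAProc queues are drained by consumers afterwards.

```python
from queue import Queue

# Three hypothetical HAProc/Kinetica pairs and their async queues.
replicas = {"A": [], "B": [], "C": []}
queues = {name: Queue() for name in replicas}

def client_write(record, sync_replica="A"):
    """Commit to one replica synchronously; enqueue for the rest."""
    replicas[sync_replica].append(record)     # synchronous commit
    for name in replicas:
        if name != sync_replica:
            queues[name].put(record)          # drained asynchronously
    return "success"                          # client sees success now

def drain(name):
    """Consumer inside each HAProc commits its queued transactions."""
    while not queues[name].empty():
        replicas[name].append(queues[name].get())

client_write({"row": 1})
print(replicas["B"])                    # [] -- B is not yet consistent
drain("B"); drain("C")
print(replicas["B"] == replicas["A"])   # True -- eventually consistent
```

The window between the client's success response and the last `drain` is exactly the window in which the store is only eventually consistent.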
Also, in all of the above cases, if the load balancer contacts a downed HAProc, it follows the "Next HAProc Path," finding an available HAProc to execute the request on or returning a failed result. Once a failed HAProc is brought back online, it informs the load balancer that it is alive only if its child Kinetica is up and running AND its queues are empty (this guarantees execution of all synchronous transactions). If the queues are not empty, the HAProc's consumers process the remaining transactions and eventually empty the queue, at which point the main HAProc informs the load balancer that it is ready for requests.
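The "Next HAProc Path" and the readiness rule above can be sketched as follows; the function and field names are hypothetical, chosen only to mirror the description.

```python
# Try each HAProc in order; return the first live result, or a failure
# when none are available (the "Next HAProc Path").
def route(request, haprocs):
    for hap in haprocs:
        if hap["alive"]:
            return hap["handler"](request)
    return {"status": "failed", "reason": "no HAProc available"}

def ready(kinetica_up, queues_empty):
    """A recovering HAProc advertises itself as alive only when its
    child Kinetica is up AND its replay queues are empty."""
    return kinetica_up and queues_empty

haprocs = [
    {"alive": False, "handler": lambda r: {"status": "ok", "by": "hap1"}},
    {"alive": True,  "handler": lambda r: {"status": "ok", "by": "hap2"}},
]
print(route({"op": "query"}, haprocs))            # hap1 is skipped; hap2 serves
print(ready(kinetica_up=True, queues_empty=False))  # False: still draining
```

Requiring empty queues before rejoining is what guarantees that every transaction logged during the outage has been applied before the node serves reads.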