Overview
Kinetica provides two means of backing up data:
- Database Backup - full, incremental, & differential data hot backup
- System Backup - file-based full system backup
Database Backup
SQL commands can be used to initiate hot backups (with full, incremental, & differential snapshots) and restorations of schema objects & data within the database.
| Objects Backed Up | Objects Not Backed Up |
|---|---|
| Credentials | Graphs |
| Data Sinks | KiFS Files |
| Data Sources | ML Models/Containers |
| Resource Groups | Symbols |
| Roles | |
| SQL Procedures | |
| SQL-GPT Contexts | |
| Streams | |
| Tables | |
| UDFs | |
| UDF Environments | |
| Users | |
| Views | |
- For the set of SQL commands for database backup, see Database Backup/Restore.
- For a full system backup, see System Backup.
Snapshot Types
Three types of snapshots are supported for database objects & data:
- full - snapshot of the given database objects & data
- incremental - snapshot of the changes in the database objects & data since the last snapshot of any kind
- differential - snapshot of the changes in the database objects & data since the last full snapshot
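The definitions above imply which snapshots are needed to rebuild the most recent state. The following is a minimal sketch of that reasoning only, not a description of how Kinetica performs a restore internally; the `snapshots_needed` function and the snapshot labels are hypothetical names used for illustration.

```python
from typing import List, Tuple

# Each snapshot is (label, kind); kind is "full", "incremental", or "differential".
Snapshot = Tuple[str, str]


def snapshots_needed(history: List[Snapshot]) -> List[str]:
    """Return the labels needed to rebuild the latest state, oldest first.

    `history` is ordered oldest to newest. This is an illustration of the
    snapshot definitions, not Kinetica's actual restore procedure.
    """
    needed: List[str] = []
    i = len(history) - 1
    while i >= 0:
        label, kind = history[i]
        needed.append(label)
        if kind == "full":
            break  # a full snapshot depends on nothing before it
        if kind == "differential":
            # Changes since the last full snapshot: skip back to that full.
            i -= 1
            while i >= 0 and history[i][1] != "full":
                i -= 1
        elif kind == "incremental":
            # Changes since the last snapshot of any kind: step back by one.
            i -= 1
        else:
            raise ValueError(f"unknown snapshot kind: {kind!r}")
    else:
        # The loop ran out of history without reaching a full snapshot.
        raise ValueError("history contains no full snapshot to start from")
    return needed[::-1]


history = [
    ("mon", "full"),
    ("tue", "incremental"),
    ("wed", "differential"),
    ("thu", "incremental"),
]
print(snapshots_needed(history))
```

In this sketch, the differential snapshot makes the earlier incremental one unnecessary, since a differential snapshot already covers every change since the last full snapshot.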
Backup Storage
Database backup files will be transferred to the target specified in the given data sink. There, they will be stored under two levels of directories: the top-level directory will be named after the database backup, and the subdirectory will be named with the timestamp at which the snapshot was taken.
Backup Use Case
A typical usage of the backup feature is:
- create a backup, taking an initial full snapshot
- schedule iterative incremental or differential snapshots
- restore a backup
Initial Backup
To create the initial backup, run a CREATE BACKUP statement, in SQL, that specifies:
- the name to use for the backup (the backed-up database object set)
- the data sink that will be used to transfer the backed-up files to the remote store (e.g., s3)
- the set of database objects to back up
- daily_backup - name of the backup
- backup_ds - data sink targeting the remote file service
- example_backup - name of the schema to back up
Create Initial Backup Example
Schedule Iterative Snapshots
To schedule iterative snapshots after the initial backup is done, create a SQL procedure that specifies:
- the name of the backed-up database object set (same as the initial backup)
- the data sink that will be used to transfer the snapshots to the remote store (same as the initial backup)
- the schedule for running the incremental snapshots
- daily_backup - name of the backup to which snapshots will be added
- backup_ds - data sink targeting the remote file service
- 1 DAY - daily snapshot interval
- STARTING AT...2025-01-01 - starting at a date in the past causes the first snapshot to be taken at the next possible time interval
- STARTING AT...00:00:00 - schedule the snapshot to be taken at midnight
Schedule Iterative Snapshots Example
Restore Backup
To restore database objects and table data from the latest snapshot in a backup, use the following parameters:
- daily_backup - name of the backup to restore
- restore_ds - data source targeting the remote file service
- example_backup - name of the schema to restore
- replace - any existing database object will be overwritten by its counterpart from the backup
Restore Backup Example
System Backup
KAgent can be used to simplify the processes of backing up and restoring the Kinetica database. It is distributed separately from the database and can be installed and used to configure a Kinetica cluster following the instructions under Kinetica Installation with KAgent.
There are two interfaces to KAgent for backing up a Kinetica cluster:
- Graphical User Interface
- Command-Line Interface (described below)
Prerequisites
KAgent backup management has two requirements:
- KAgent installed, with access to the cluster being managed
- A properly configured Kinetica cluster
Kinetica does not need to be offline to be backed up or restored.
The backup process will put Kinetica into read-only mode,
however, which will block operations requiring disk write access
(table creation/modification, persist-backed ingestion, etc.).
Backing Up
All the data in Kinetica can be backed up in either an ad-hoc or scheduled fashion, using the command line. To learn about how to back up Kinetica using the KAgent GUI, consult Backups.
Backups are stored local to each node in the cluster. Those local backup target directories can be mounted via NFS or similar external shared storage to consolidate the backup files on a single device.
If multiple clusters are backed up to the same shared storage under the same backup directory, those backups can be restored to any of the clusters in the group; e.g., if Cluster A and Cluster B are both backed up to the same shared location, the backups of Cluster A can be restored to Cluster B and vice versa.
The base command for creating a backup:
Schedule
There are three options for schedule:
- now - runs a single backup right now without creating or modifying the schedule (default behavior)
- a quoted crontab schedule string - overwrites any existing backup schedule with the one specified. For example, '0 0 1 1-3 *' will schedule backups at 12:00 AM on the 1st day of each month, January through March. Consult the crontab documentation for details on the schedule specification format.
- never - clears the current backup schedule for the given cluster name
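To check what a crontab string will do before handing it to KAgent, the five fields can be evaluated against sample timestamps. The following is a minimal sketch of standard five-field crontab matching, independent of KAgent; it handles only `*`, single numbers, ranges, and comma lists (no step values, month or day names, or `7` for Sunday). The `cron_matches` function is a hypothetical helper, and the expression `'0 22 * * 1-5'` is an assumed rendering of "22:00 on days 1 through 5 of every week".

```python
from datetime import datetime


def _field_matches(field: str, value: int) -> bool:
    """Match one crontab field supporting '*', 'n', 'a-b', and comma lists."""
    if field == "*":
        return True
    for part in field.split(","):
        if "-" in part:
            low, high = part.split("-")
            if int(low) <= value <= int(high):
                return True
        elif int(part) == value:
            return True
    return False


def cron_matches(expr: str, when: datetime) -> bool:
    """Return True if `when` (to the minute) matches the crontab expression."""
    minute, hour, dom, month, dow = expr.split()
    # crontab counts Sunday as 0; Python's weekday() counts Monday as 0.
    cron_dow = (when.weekday() + 1) % 7
    if not (
        _field_matches(minute, when.minute)
        and _field_matches(hour, when.hour)
        and _field_matches(month, when.month)
    ):
        return False
    dom_ok = _field_matches(dom, when.day)
    dow_ok = _field_matches(dow, cron_dow)
    # Standard crontab rule: if both day fields are restricted, either may match.
    if dom != "*" and dow != "*":
        return dom_ok or dow_ok
    return dom_ok and dow_ok


# Midnight on February 1st falls inside "1st of the month, January through March".
print(cron_matches("0 0 1 1-3 *", datetime(2019, 2, 1, 0, 0)))
# April is outside the 1-3 month range.
print(cron_matches("0 0 1 1-3 *", datetime(2019, 4, 1, 0, 0)))
```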
Backup Path
The backup path should be any valid file path on the Kinetica cluster nodes. If the directory does not exist on one or more nodes, KAgent will create it. The default backup path is /opt/backups.
Under this backup path directory, KAgent will create a subdirectory with the
name of the cluster as the directory name. KAgent will then create a snapshot
subdirectory under the cluster-specific subdirectory, named with the date/time
at which the backup was initiated, into which all backup files will be placed.
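The directory layout described above (backup path, then cluster name, then a date/time directory) can be sketched as a small path builder. This is an illustration only: the `snapshot_dir` function is hypothetical, and the timestamp format used for the snapshot directory name is an assumption, since the exact format KAgent uses is not stated here.

```python
from datetime import datetime
from pathlib import PurePosixPath


def snapshot_dir(
    backup_path: str,
    cluster_name: str,
    started: datetime,
    # ASSUMPTION: placeholder format; check real backups for KAgent's actual naming.
    timestamp_format: str = "%Y-%m-%d-%H-%M-%S",
) -> PurePosixPath:
    """Build <backup path>/<cluster name>/<date-time> for one snapshot."""
    return PurePosixPath(backup_path) / cluster_name / started.strftime(timestamp_format)


# A backup of "mycluster" under the default path, started at 12:34:56 on January 2nd, 2019.
print(snapshot_dir("/opt/backups", "mycluster", datetime(2019, 1, 2, 12, 34, 56)))
```

Passing the format as a parameter keeps the sketch usable once the real directory naming has been confirmed on a cluster node.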
For instance, given the following backup command execution, run at 12:34:56 on
January 2nd, 2019:
Examples
To list backups scheduled for the mycluster cluster:
To create a backup of the mycluster cluster in the (default)
/opt/backups directory:
To schedule backups in /opt/backup for the cluster named
mainkincluster at 22:00 on days 1 through 5 of every week:
To clear the backup schedule for the mycluster cluster:
Listing
The backups available to be restored to a given cluster can be listed via command line or the KAgent GUI. See Snapshots for details on how to display a list of backups in the GUI.
The base command for listing backups available to a given cluster:
Backup Path
The backup path should be the file path on each Kinetica cluster node that contains the backups to list. The default backup path is /opt/backups.
KAgent will look on each cluster node for the directory named in the --backup-path
parameter and list the contents of that directory.
For instance, given the following list command execution:
KAgent will list the backups found in /opt/backup, on each
node of the cluster mycluster.
Examples
To list all backups available to the cluster mycluster:
/opt/backups directory:
To list all backups available to the cluster clusterA, under a shared backup
directory of /opt/backup:
/opt/backup:
Note that if multiple clusters are backed up to the same shared location,
all clusters’ backups will be listed, allowing for later restoration of the
targeted cluster from any of the other clusters’ backups.
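The effect of a shared backup directory can be pictured by walking the layout directly: one subdirectory per cluster, each holding one subdirectory per snapshot. The following is a minimal sketch using only the filesystem, not KAgent; the `list_backups` function is a hypothetical helper and does not reproduce the output of the KAgent list command.

```python
from pathlib import Path
from typing import Dict, List


def list_backups(backup_path: str) -> Dict[str, List[str]]:
    """Map each cluster directory under `backup_path` to its snapshot directory names."""
    root = Path(backup_path)
    backups: Dict[str, List[str]] = {}
    if not root.is_dir():
        return backups  # nothing backed up at this path on this node
    for cluster_dir in sorted(root.iterdir()):
        if cluster_dir.is_dir():
            backups[cluster_dir.name] = sorted(
                snap.name for snap in cluster_dir.iterdir() if snap.is_dir()
            )
    return backups


if __name__ == "__main__":
    # With a shared directory, every cluster that backs up here appears in the result.
    for cluster, snapshots in list_backups("/opt/backup").items():
        print(cluster, snapshots)
```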
Restoring
Backups made through KAgent are restored through KAgent, either via command line or the KAgent GUI. See Snapshots for details on restoring from snapshots using the GUI.
The base command for restoring from backup:
Restore From
The --restore-from parameter specifies which backup snapshot should be
restored. It should be the path to the existing snapshot to restore from,
including the directory that is the name of the cluster from which the backup
was taken:
Backup Path
The backup path should be the file path on each Kinetica cluster node that contains the backups to restore. The default backup path is /opt/backups.
KAgent will look on each cluster node in the directory given in the
--backup-path parameter for the directory named in the --restore-from
parameter. If found, KAgent will restore the database on each node from its
corresponding local snapshot.
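Before starting a restore, it can help to confirm on each node that the snapshot directory actually exists. The following is a minimal sketch of the lookup described above, assuming the --restore-from value is a path relative to the backup path (cluster name directory, then snapshot directory); the `find_snapshot` function is a hypothetical helper and is not part of KAgent.

```python
from pathlib import Path
from typing import Optional


def find_snapshot(backup_path: str, restore_from: str) -> Optional[Path]:
    """Return the snapshot directory if it exists under `backup_path`, else None.

    ASSUMPTION: `restore_from` is relative to `backup_path`, e.g.
    "<cluster name>/<snapshot directory>".
    """
    candidate = Path(backup_path) / restore_from
    return candidate if candidate.is_dir() else None


if __name__ == "__main__":
    # The snapshot name below is a made-up placeholder, not a real KAgent directory name.
    found = find_snapshot("/opt/backups", "mycluster/example-snapshot")
    print("snapshot present" if found else "snapshot missing on this node")
```

Running a check like this on every node catches the case where a snapshot is present on some nodes but missing on others.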
For instance, given the following restore command execution:
Examples
To restore a backup of the mycluster cluster in the (default)
/opt/backups directory that was initiated on January 2nd, 2019 at
12:34:56:
To restore a backup in /tmp/kinetica-backups/ for the cluster named
mainkincluster that was initiated on January 2nd, 2019 at 01:23:45:
To restore a backup of one cluster, clusterA, to another cluster,
clusterB, using a backup in the (default) /opt/backups directory:
Deleting Backups
Backups can be deleted manually from their locations in the cluster’s snapshot directory. The command for deleting a backup:
For example, to delete the backup of the mycluster cluster in the
/opt/backups directory that was initiated on January 2nd, 2019 at
12:34:56, run the following command on each node in the cluster: