/admin/rebalance

URL: http://GPUDB_IP_ADDRESS:GPUDB_PORT/admin/rebalance

Rebalances the cluster so that all nodes contain an approximately equal number of records. The rebalance also distributes the shards as evenly as possible across all ranks.

This endpoint may take a long time to run, depending on the amount of data in the system. The API call may time out if run directly. It is recommended to run this endpoint asynchronously via /create/job.
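As an illustration of the URL pattern above, the following is a minimal sketch that builds, but does not send, a request for this endpoint using only the Python standard library. The host and port values are placeholders, and the JSON body shape (a single "options" map) and the Content-Type header are assumptions based on the input parameter list in this section, not a confirmed wire format.

```python
import json
import urllib.request

# Placeholder values: substitute the real GPUDB_IP_ADDRESS and GPUDB_PORT.
GPUDB_IP_ADDRESS = "127.0.0.1"  # hypothetical host
GPUDB_PORT = 9191               # hypothetical port


def make_rebalance_request(options=None):
    """Build (but do not send) a POST request for /admin/rebalance."""
    url = f"http://{GPUDB_IP_ADDRESS}:{GPUDB_PORT}/admin/rebalance"
    # Assumption: the request body is a JSON object holding the "options" map.
    body = json.dumps({"options": options or {}}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = make_rebalance_request()
print(request.full_url)
```

Sending this request directly (for example with urllib.request.urlopen) would block until the rebalance finishes, which is why running it through /create/job is recommended. The /create/job payload is not described in this section, so it is not sketched here.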

Input Parameter Description

Name Type Description
options map of string to strings

Optional parameters. The default value is an empty map ( {} ).

Supported Parameters (keys) Parameter Description
rebalance_sharded_data

If true, sharded data will be rebalanced approximately equally across the cluster. Note that for large clusters, this data transfer could be time-consuming and could delay query responses. The default value is true. The supported values are:

  • true
  • false
rebalance_unsharded_data

If true, unsharded data (data without primary keys and without shard keys) will be rebalanced approximately equally across the cluster. Note that for large clusters, this data transfer could be time-consuming and could delay query responses. The default value is true. The supported values are:

  • true
  • false
table_whitelist Comma-separated list of unsharded table names to rebalance. Not applicable to sharded tables because they are always balanced in accordance with their primary key or shard key. Cannot be used simultaneously with table_blacklist.
table_blacklist Comma-separated list of unsharded table names to not rebalance. Not applicable to sharded tables because they are always balanced in accordance with their primary key or shard key. Cannot be used simultaneously with table_whitelist.
aggressiveness Influences how much data to send per rebalance round. A higher aggressiveness setting will complete the rebalance faster. A lower aggressiveness setting will take longer, but allow for better interleaving between the rebalance and other queries. Allowed values are 1 through 10. The default value is '1'.
compact_after_rebalance

Perform compaction of deleted records once the rebalance completes, to reclaim memory and disk space. The default value is true, unless repair_incorrectly_sharded_data is set to true, in which case compaction is not performed by default. The supported values are:

  • true
  • false
compact_only

Only perform compaction; do not rebalance. The default value is false. The supported values are:

  • true
  • false
repair_incorrectly_sharded_data

Scans for any data sharded incorrectly and re-routes it to the correct location. This can be done as part of a typical rebalance after expanding the cluster, or in a standalone fashion when it is believed that data is sharded incorrectly somewhere in the cluster. Compaction will not be performed by default when this is enabled. This option may also lengthen rebalance time and increase the memory used by the rebalance. The default value is false. The supported values are:

  • true
  • false
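
The constraints listed above (string-valued map entries, aggressiveness limited to 1 through 10, and table_whitelist and table_blacklist being mutually exclusive) can be checked on the client before the call is made. The helper below is a hypothetical sketch, not part of any GPUdb client library; the function name and the table names in the example are made up for illustration. Keys that are not set are omitted so that the server applies its own defaults.

```python
def build_rebalance_options(
    rebalance_sharded_data=None,
    rebalance_unsharded_data=None,
    table_whitelist=None,
    table_blacklist=None,
    aggressiveness=None,
    compact_after_rebalance=None,
    compact_only=None,
    repair_incorrectly_sharded_data=None,
):
    """Return a string-to-string options map for /admin/rebalance."""
    # The two table lists cannot be used simultaneously.
    if table_whitelist and table_blacklist:
        raise ValueError("table_whitelist and table_blacklist cannot be used together")
    # Allowed aggressiveness values are 1 through 10.
    if aggressiveness is not None and not 1 <= int(aggressiveness) <= 10:
        raise ValueError("aggressiveness must be between 1 and 10")

    flags = {
        "rebalance_sharded_data": rebalance_sharded_data,
        "rebalance_unsharded_data": rebalance_unsharded_data,
        "compact_after_rebalance": compact_after_rebalance,
        "compact_only": compact_only,
        "repair_incorrectly_sharded_data": repair_incorrectly_sharded_data,
    }
    options = {}
    for key, value in flags.items():
        if value is not None:
            # Boolean options are sent as the strings "true" or "false".
            options[key] = "true" if value else "false"
    if table_whitelist:
        options["table_whitelist"] = ",".join(table_whitelist)
    if table_blacklist:
        options["table_blacklist"] = ",".join(table_blacklist)
    if aggressiveness is not None:
        options["aggressiveness"] = str(int(aggressiveness))
    return options


# Example: rebalance only two (hypothetical) unsharded tables, moderately fast.
example_options = build_rebalance_options(
    rebalance_sharded_data=False,
    table_whitelist=["events", "lookup"],
    aggressiveness=3,
)
print(example_options)
```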

Output Parameter Description

The GPUdb server embeds the endpoint response inside a standard response structure which contains status information and the actual response to the query. Here is a description of the various fields of the wrapper:

Name Type Description
status String 'OK' or 'ERROR'
message String Empty if success or an error message
data_type String 'admin_rebalance_request' or 'none' in case of an error
data String Empty string
data_str JSON or String

This embedded JSON represents the result of the /admin/rebalance endpoint:

Name Type Description
info map of string to strings Additional information.

Empty string in case of an error.
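
A rough sketch of unwrapping such a response follows. It assumes the wrapper arrives as JSON text and that data_str carries the embedded result as a JSON-encoded string; the function name and both sample payloads are invented for illustration and are not captured server output.

```python
import json


def parse_rebalance_response(raw_text):
    """Return the 'info' map from a wrapped /admin/rebalance response."""
    wrapper = json.loads(raw_text)
    if wrapper.get("status") != "OK":
        # On error, 'message' holds the error text and data_str is empty.
        raise RuntimeError(wrapper.get("message") or "rebalance failed")
    data_str = wrapper.get("data_str", "")
    if isinstance(data_str, str):
        # Assumption: the embedded result is serialized as a JSON string.
        payload = json.loads(data_str) if data_str else {}
    else:
        payload = data_str
    return payload.get("info", {})


# Made-up sample payloads that follow the field list above.
sample_ok = json.dumps({
    "status": "OK",
    "message": "",
    "data_type": "admin_rebalance_request",
    "data": "",
    "data_str": json.dumps({"info": {}}),
})
sample_error = json.dumps({
    "status": "ERROR",
    "message": "example failure",
    "data_type": "none",
    "data": "",
    "data_str": "",
})

print(parse_rebalance_response(sample_ok))
```

Checking status before touching data_str keeps the error path simple, since data_str is an empty string when the call fails.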