Class AdminRebalanceRequest

  • All Implemented Interfaces:
    org.apache.avro.generic.GenericContainer, org.apache.avro.generic.IndexedRecord

    public class AdminRebalanceRequest
    extends Object
    implements org.apache.avro.generic.IndexedRecord
    A set of parameters for GPUdb.adminRebalance.

    Rebalances the data in the cluster so that all nodes contain approximately the same number of records, and/or rebalances the shards so that they are distributed as evenly as possible across all ranks.

    The database must be offline for this operation; see GPUdb.adminOffline.

    * If GPUdb.adminRebalance is invoked after a change is made to the cluster (e.g., a host was added or removed), sharded data will be redistributed evenly across the cluster by number of shards per rank, while unsharded data will be redistributed across the cluster by data size per rank.

    * If GPUdb.adminRebalance is invoked when unsharded (a.k.a. randomly-sharded) data has become unevenly distributed across the cluster over time, sharded data will not move, while unsharded data will be redistributed across the cluster by data size per rank.

    NOTE: Replicated data will not move as a result of this call.

    This endpoint's processing time depends on the amount of data in the system, so the API call may time out if run directly. It is recommended to run this endpoint asynchronously via GPUdb.createJob.

    • Constructor Detail

      • AdminRebalanceRequest

        public AdminRebalanceRequest()
        Constructs an AdminRebalanceRequest object with default parameters.
      • AdminRebalanceRequest

        public AdminRebalanceRequest(Map<String,String> options)
        Constructs an AdminRebalanceRequest object with the specified parameters.
        Parameters:
        options - Optional parameters.
        • REBALANCE_SHARDED_DATA: If TRUE, sharded data will be rebalanced approximately equally across the cluster. Note that for clusters with large amounts of sharded data, this data transfer could be time consuming and result in delayed query responses. Supported values: TRUE, FALSE. The default value is TRUE.
        • REBALANCE_UNSHARDED_DATA: If TRUE, unsharded data (a.k.a. randomly-sharded) will be rebalanced approximately equally across the cluster. Note that for clusters with large amounts of unsharded data, this data transfer could be time consuming and result in delayed query responses. Supported values: TRUE, FALSE. The default value is TRUE.
        • TABLE_INCLUDES: Comma-separated list of unsharded table names to rebalance. Not applicable to sharded tables because they are always rebalanced. Cannot be used simultaneously with TABLE_EXCLUDES. This parameter is ignored if REBALANCE_UNSHARDED_DATA is FALSE.
        • TABLE_EXCLUDES: Comma-separated list of unsharded table names to not rebalance. Not applicable to sharded tables because they are always rebalanced. Cannot be used simultaneously with TABLE_INCLUDES. This parameter is ignored if REBALANCE_UNSHARDED_DATA is FALSE.
        • AGGRESSIVENESS: Influences how much data is moved at a time during rebalance. A higher AGGRESSIVENESS will complete the rebalance faster. A lower AGGRESSIVENESS will take longer but allow for better interleaving between the rebalance and other queries. Valid values are constants from 1 (lowest) to 10 (highest). The default value is '10'.
        • COMPACT_AFTER_REBALANCE: Perform compaction of deleted records once the rebalance completes to reclaim memory and disk space. Default is TRUE, unless REPAIR_INCORRECTLY_SHARDED_DATA is set to TRUE. Supported values: TRUE, FALSE. The default value is TRUE.
        • COMPACT_ONLY: If set to TRUE, ignore rebalance options and attempt to perform compaction of deleted records to reclaim memory and disk space without rebalancing first. Supported values: TRUE, FALSE. The default value is FALSE.
        • REPAIR_INCORRECTLY_SHARDED_DATA: Scans for any data sharded incorrectly and re-routes the data to the correct location. Only necessary if GPUdb.adminVerifyDb reports an error in sharding alignment. This can be done as part of a typical rebalance after expanding the cluster or in a standalone fashion when it is believed that data is sharded incorrectly somewhere in the cluster. Compaction will not be performed by default when this is enabled. If this option is set to TRUE, the time necessary to rebalance and the memory used by the rebalance may increase. Supported values: TRUE, FALSE. The default value is FALSE.
        The default value is an empty Map.
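
        The constraints in the option list above (TABLE_INCLUDES and TABLE_EXCLUDES are mutually exclusive; AGGRESSIVENESS ranges from 1 to 10) can be checked on the client before the request is built. The following is a minimal sketch, not part of the GPUdb API: the class RebalanceOptionsSketch and its method are hypothetical, and the lowercase key strings are an assumption about the values behind the documented option constants.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical client-side helper; not part of the GPUdb API. */
final class RebalanceOptionsSketch {
    // ASSUMPTION: string values behind the documented option constants.
    static final String TABLE_INCLUDES = "table_includes";
    static final String TABLE_EXCLUDES = "table_excludes";
    static final String AGGRESSIVENESS = "aggressiveness";

    private RebalanceOptionsSketch() {}

    /** Returns a copy of the options after checking the documented constraints. */
    static Map<String, String> validated(Map<String, String> options) {
        // TABLE_INCLUDES and TABLE_EXCLUDES cannot be used simultaneously.
        if (options.containsKey(TABLE_INCLUDES) && options.containsKey(TABLE_EXCLUDES)) {
            throw new IllegalArgumentException(
                "TABLE_INCLUDES and TABLE_EXCLUDES cannot be used together");
        }
        // AGGRESSIVENESS, if present, must be an integer from 1 to 10.
        String level = options.get(AGGRESSIVENESS);
        if (level != null) {
            int n;
            try {
                n = Integer.parseInt(level.trim());
            } catch (NumberFormatException e) {
                throw new IllegalArgumentException("AGGRESSIVENESS must be an integer", e);
            }
            if (n < 1 || n > 10) {
                throw new IllegalArgumentException("AGGRESSIVENESS must be between 1 and 10");
            }
        }
        return new LinkedHashMap<>(options);
    }
}
```

        The returned map could then be passed to the constructor documented above, e.g. new AdminRebalanceRequest(RebalanceOptionsSketch.validated(options)).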
    • Method Detail

      • getClassSchema

        public static org.apache.avro.Schema getClassSchema()
        This method supports the Avro framework and is not intended to be called directly by the user.
        Returns:
        The schema for the class.
      • getOptions

        public Map<String,String> getOptions()
        Optional parameters.
        • REBALANCE_SHARDED_DATA: If TRUE, sharded data will be rebalanced approximately equally across the cluster. Note that for clusters with large amounts of sharded data, this data transfer could be time consuming and result in delayed query responses. Supported values: TRUE, FALSE. The default value is TRUE.
        • REBALANCE_UNSHARDED_DATA: If TRUE, unsharded data (a.k.a. randomly-sharded) will be rebalanced approximately equally across the cluster. Note that for clusters with large amounts of unsharded data, this data transfer could be time consuming and result in delayed query responses. Supported values: TRUE, FALSE. The default value is TRUE.
        • TABLE_INCLUDES: Comma-separated list of unsharded table names to rebalance. Not applicable to sharded tables because they are always rebalanced. Cannot be used simultaneously with TABLE_EXCLUDES. This parameter is ignored if REBALANCE_UNSHARDED_DATA is FALSE.
        • TABLE_EXCLUDES: Comma-separated list of unsharded table names to not rebalance. Not applicable to sharded tables because they are always rebalanced. Cannot be used simultaneously with TABLE_INCLUDES. This parameter is ignored if REBALANCE_UNSHARDED_DATA is FALSE.
        • AGGRESSIVENESS: Influences how much data is moved at a time during rebalance. A higher AGGRESSIVENESS will complete the rebalance faster. A lower AGGRESSIVENESS will take longer but allow for better interleaving between the rebalance and other queries. Valid values are constants from 1 (lowest) to 10 (highest). The default value is '10'.
        • COMPACT_AFTER_REBALANCE: Perform compaction of deleted records once the rebalance completes to reclaim memory and disk space. Default is TRUE, unless REPAIR_INCORRECTLY_SHARDED_DATA is set to TRUE. Supported values: TRUE, FALSE. The default value is TRUE.
        • COMPACT_ONLY: If set to TRUE, ignore rebalance options and attempt to perform compaction of deleted records to reclaim memory and disk space without rebalancing first. Supported values: TRUE, FALSE. The default value is FALSE.
        • REPAIR_INCORRECTLY_SHARDED_DATA: Scans for any data sharded incorrectly and re-routes the data to the correct location. Only necessary if GPUdb.adminVerifyDb reports an error in sharding alignment. This can be done as part of a typical rebalance after expanding the cluster or in a standalone fashion when it is believed that data is sharded incorrectly somewhere in the cluster. Compaction will not be performed by default when this is enabled. If this option is set to TRUE, the time necessary to rebalance and the memory used by the rebalance may increase. Supported values: TRUE, FALSE. The default value is FALSE.
        The default value is an empty Map.
        Returns:
        The current value of options.
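
        Because the options map is empty by default, getOptions() may return a map that omits keys whose documented defaults still apply. The sketch below mirrors those documented defaults on the client side, including the rule that COMPACT_AFTER_REBALANCE defaults to TRUE unless REPAIR_INCORRECTLY_SHARDED_DATA is TRUE. It is illustrative only: the class EffectiveRebalanceDefaults is hypothetical, the lowercase key and value strings are assumed, and the server remains the authority on how defaults are actually applied.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper that fills in the defaults documented for the options map. */
final class EffectiveRebalanceDefaults {
    private EffectiveRebalanceDefaults() {}

    /** Returns the supplied options overlaid on the documented default values. */
    static Map<String, String> resolve(Map<String, String> supplied) {
        Map<String, String> out = new LinkedHashMap<>();
        // ASSUMPTION: key and value strings behind the documented constants.
        out.put("rebalance_sharded_data", "true");
        out.put("rebalance_unsharded_data", "true");
        out.put("aggressiveness", "10");
        out.put("compact_only", "false");
        out.put("repair_incorrectly_sharded_data", "false");
        // Caller-supplied values take precedence over the defaults.
        out.putAll(supplied);
        // Compaction is on by default unless shard repair was requested.
        if (!out.containsKey("compact_after_rebalance")) {
            boolean repair =
                "true".equalsIgnoreCase(out.get("repair_incorrectly_sharded_data"));
            out.put("compact_after_rebalance", repair ? "false" : "true");
        }
        return out;
    }
}
```

        For example, resolving the map returned by getOptions() on a default-constructed request would be expected to yield aggressiveness "10" and compact_after_rebalance "true" under these assumptions.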
      • setOptions

        public AdminRebalanceRequest setOptions(Map<String,String> options)
        Optional parameters.
        • REBALANCE_SHARDED_DATA: If TRUE, sharded data will be rebalanced approximately equally across the cluster. Note that for clusters with large amounts of sharded data, this data transfer could be time consuming and result in delayed query responses. Supported values: TRUE, FALSE. The default value is TRUE.
        • REBALANCE_UNSHARDED_DATA: If TRUE, unsharded data (a.k.a. randomly-sharded) will be rebalanced approximately equally across the cluster. Note that for clusters with large amounts of unsharded data, this data transfer could be time consuming and result in delayed query responses. Supported values: TRUE, FALSE. The default value is TRUE.
        • TABLE_INCLUDES: Comma-separated list of unsharded table names to rebalance. Not applicable to sharded tables because they are always rebalanced. Cannot be used simultaneously with TABLE_EXCLUDES. This parameter is ignored if REBALANCE_UNSHARDED_DATA is FALSE.
        • TABLE_EXCLUDES: Comma-separated list of unsharded table names to not rebalance. Not applicable to sharded tables because they are always rebalanced. Cannot be used simultaneously with TABLE_INCLUDES. This parameter is ignored if REBALANCE_UNSHARDED_DATA is FALSE.
        • AGGRESSIVENESS: Influences how much data is moved at a time during rebalance. A higher AGGRESSIVENESS will complete the rebalance faster. A lower AGGRESSIVENESS will take longer but allow for better interleaving between the rebalance and other queries. Valid values are constants from 1 (lowest) to 10 (highest). The default value is '10'.
        • COMPACT_AFTER_REBALANCE: Perform compaction of deleted records once the rebalance completes to reclaim memory and disk space. Default is TRUE, unless REPAIR_INCORRECTLY_SHARDED_DATA is set to TRUE. Supported values: TRUE, FALSE. The default value is TRUE.
        • COMPACT_ONLY: If set to TRUE, ignore rebalance options and attempt to perform compaction of deleted records to reclaim memory and disk space without rebalancing first. Supported values: TRUE, FALSE. The default value is FALSE.
        • REPAIR_INCORRECTLY_SHARDED_DATA: Scans for any data sharded incorrectly and re-routes the data to the correct location. Only necessary if GPUdb.adminVerifyDb reports an error in sharding alignment. This can be done as part of a typical rebalance after expanding the cluster or in a standalone fashion when it is believed that data is sharded incorrectly somewhere in the cluster. Compaction will not be performed by default when this is enabled. If this option is set to TRUE, the time necessary to rebalance and the memory used by the rebalance may increase. Supported values: TRUE, FALSE. The default value is FALSE.
        The default value is an empty Map.
        Parameters:
        options - The new value for options.
        Returns:
        this, to mimic the builder pattern (allows calls to be chained).
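
        Returning this from a setter is what allows calls to be chained in a single expression. The sketch below illustrates the pattern with a stand-in class so that it compiles without the GPUdb library; FluentRequestSketch is hypothetical, and its handling of a null argument is an assumption rather than documented behavior of AdminRebalanceRequest.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in that mirrors the getOptions/setOptions signatures. */
final class FluentRequestSketch {
    private Map<String, String> options = new HashMap<>();

    /** Returns the current value of options. */
    Map<String, String> getOptions() {
        return options;
    }

    /** Stores the new options and returns this so that calls can be chained. */
    FluentRequestSketch setOptions(Map<String, String> options) {
        // ASSUMPTION: a null argument is treated as an empty map.
        this.options = (options == null) ? new HashMap<>() : options;
        return this;
    }

    public static void main(String[] args) {
        Map<String, String> opts = new HashMap<>();
        opts.put("compact_only", "true"); // assumed key and value strings
        // Construct, configure, and read back in one chained expression.
        Map<String, String> stored = new FluentRequestSketch().setOptions(opts).getOptions();
        System.out.println(stored.get("compact_only"));
    }
}
```

        With the real class, the equivalent chained form would be new AdminRebalanceRequest().setOptions(options), using only the constructor and setter documented on this page.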
      • getSchema

        public org.apache.avro.Schema getSchema()
        This method supports the Avro framework and is not intended to be called directly by the user.
        Specified by:
        getSchema in interface org.apache.avro.generic.GenericContainer
        Returns:
        The schema object describing this class.
      • get

        public Object get(int index)
        This method supports the Avro framework and is not intended to be called directly by the user.
        Specified by:
        get in interface org.apache.avro.generic.IndexedRecord
        Parameters:
        index - the position of the field to get
        Returns:
        value of the field with the given index.
        Throws:
        IndexOutOfBoundsException
      • put

        public void put(int index,
                        Object value)
        This method supports the Avro framework and is not intended to be called directly by the user.
        Specified by:
        put in interface org.apache.avro.generic.IndexedRecord
        Parameters:
        index - the position of the field to set
        value - the value to set
        Throws:
        IndexOutOfBoundsException
      • hashCode

        public int hashCode()
        Overrides:
        hashCode in class Object