Skip to main content
Dictionary encoding is a data compression technique that can be applied to individual columns. It will store each unique value of a column in memory and associate each record with its corresponding unique value. This eliminates the storage of duplicate values in a column, reducing the overall memory & disk space required to hold the data. Dictionary encoding is most effective on columns with low cardinality; the fewer the number of unique values within a column, the greater the reduction in memory usage. Queries against the encoded column will generally be faster. A column can be created with dictionary encoding in effect by applying the dict data handling property to the column during type creation (using /create/type). An existing column can be converted to use dictionary encoding by modifying the column and applying the dict property (using /alter/table). For example, to apply dictionary encoding to a column during table creation:
CREATE OR REPLACE TABLE example.employee
(
    id INTEGER NOT NULL,
    dept_id INTEGER NOT NULL,
    manager_id INTEGER,
    first_name VARCHAR(32),
    last_name VARCHAR(32, DICT),
    sal DECIMAL(18,4),
    hire_date DATE,
    PRIMARY KEY (id, dept_id),
    SHARD KEY (dept_id)
)
To apply dictionary encoding to a column after table creation:
ALTER TABLE example.employee
ALTER COLUMN last_name VARCHAR(32, DICT)
To remove dictionary encoding from a column:
ALTER TABLE example.employee
ALTER COLUMN last_name VARCHAR(32)
Don’t leave any spaces between properties in an /alter/table command’s column_properties option.

Limitations and Cautions

Columns with any of the following characteristics are not eligible for dictionary encoding: