A primary key index is created by default when a table is
created with a primary key specified. It will match the
primary key in nature, in that it will be a single-column index if the
primary key is on a single column and a multi-column composite index if the
primary key is a composite primary key.
The primary key index is hash-based and optimizes the performance of
equality-based filter expressions (e.g. (x = 0)). Note that if the index
is a composite index, the filter expressions employed must reference all
of the primary key columns to take advantage of the performance gains of the
index.
Operations which make use of a primary key index include:
The entire primary key index for a table is loaded into memory when that table
is accessed, and depending on the size of the key columns and record count, this
could have a large impact on available RAM. The primary key index can instead
be stored solely on disk, saving RAM, but reducing some performance, by setting
the primary_key_type to disk when creating the table.
Relational Index
A relational index is created as the result of applying a
foreign key to a column. A description of the index and
its requirements can be found under Foreign Keys.
Tip
In SQL, foreign keys can be applied during table creation, via
CREATE TABLE, as well as afterwards, via
ALTER TABLE.
Column Index
A column index (also known as an attribute index) can be applied to a column
in a table to improve the performance of operations applied to that
column in an expression.
The column index is implemented as a b-tree, which provides performance
improvements for both equality-based (e.g. (x = 0)) and range-based (e.g.
(x > 0)) filter criteria on individual columns. Multi-column and
function-based indexes are not supported at this time.
If the column index will be indexing a column with low cardinality data
(e.g., number of unique values * 2 < total number of records) then using a
low cardinality index should use less
memory and still give the same performance improvements.
As primary key indexes only optimize the performance of equality-based filter
criteria, a column index can be applied to a primary key column (or each of
the columns in a composite primary key) for a performance boost on range-based
filter criteria--the two index types are not mutually exclusive.
Column indexes optimize the performance of the following operations, when
those operations are given equality-based or range-based filter expressions:
To apply a column index to a column, the /alter/table endpoint
should be called with the create_index action. The default index type is
the column index, so no index_type option is necessary to specify.
A column index can be applied to any table column
specific typeexcept:
wkt
json
array of any type
A column index can be applied to tables as well as to
materialized views that do not have a filter or
join as the last operation in their creation.
Low-Cardinality Index
A low-cardinality index is a column index
intended for columns with low cardinality--those with many duplicate values.
The less distinct values in a column are, the less memory this index will use.
It is recommended to use a low-cardinality index when the number of distinct
values in a column is less than half the total number of values in the column.
To apply a low-cardinality index to a column, the
/alter/table endpoint should be called with the
create_index action and the low_cardinality index type.
A chunk skip index can be applied to a column in a table to improve the
performance of equality-based filtering expressions applied to that column. The
primary use for this index is with partitioned tables,
where the filter is applied to one or more of the partition key columns. The
larger the data set and the greater the number of partitions generated, the more
performant filters on the table will be when the index is applied.
A chunk skip index can be applied to any table column with a
specific type matching any of the following:
int
long
ulong
charN
date
time
datetime
timestamp
A chunk skip index can be applied to tables as well as to
materialized views that do not have a filter or
join as the last operation in their creation.
Geospatial Index
A geospatial index can be applied to one or more columns in a table to
improve the performance of geospatial functions applied to them.
A CAGRA (Cuda Anns GRAph-based) index can be applied to a
vector column in a table to improve the
performance of vector searches applied to that
column.
A CAGRA index is not automatically maintained and must be refreshed manually
to reflect the latest set of records in the associated table.
An HNSW (Hierarchical Navigable Small Worlds) index can be applied to a
vector column in a table to improve the
performance of vector searches applied to that
column.
An HNSW index is automatically maintained and doesn't need to be refreshed
manually to reflect the latest set of records in the associated table.