Database index

Unlock the power of efficient data retrieval with database indexing. A crucial yet often overlooked aspect of database management, indexing can dramatically improve your system's performance. Let's delve into its intricacies and understand how it can revolutionize your data handling.

Bitmap index

A special kind of index that stores the bulk of its data as bit arrays (bitmaps) and answers most queries by performing bitwise logical operations on these bitmaps.

Designed for cases where the values of a variable repeat very frequently (e.g., sex field in a customer database).

Non-Clustered

The data is present in arbitrary order, but the logical ordering is specified by the index. The data rows may be spread throughout the table regardless of the value of the indexed column or expression.

In a non-clustered index, The physical order of the rows is not the same as the index order.

Covering index

This index is only used to locate data records in the table and not to return data.

It can dramatically speed up data retrieval but may itself be large due to the additional keys, which slow down data insertion and update. To reduce such index size, some systems allow including non-key fields in the index.

Index concurrency control

A.k.a. index locking

Clustered

alters the data block into a certain distinct order to match the index, resulting in the row data being stored in order

Only one clustered index can be created on a given database table
The ordering of the physical data rows in accordance with the index blocks that point to them

Column order

The order that the index definition defines the columns in is important.

It is possible to retrieve a set of row identifiers using only the first indexed column.
However, it is not possible or efficient (on most databases) to retrieve the set of rows identifiers using the second or greater indexed column.

Data structure for query optimization in databases

Index: a copy of selected columns of data, from a table, that is designed to enable very efficient search

Key: direct link to the original row of data from which it was copied to allow the complete row to be retrieved efficiently
Partial indices: index entries are created only for those records that satisfy some conditional expression or user-defined functions

Standardization

No standard defines how to create indexes, because the ISO SQL Standard does not cover physical aspects.

Policing the database constraints

Indexes are used to police database constraints, such as UNIQUE, EXCLUSION, PRIMARY KEY, AND FOREIGN KEY

An index may be declared as Unique, which creates an implicit constraint on the underlying table.
Database systems usually implicitly create an index on a set of columns declared Primary KEY, and some are capable of using an already-existing index to police this constraint.
Many database systems require that both referencing and referenced sets of columns in a Foreign KEY constraint are indexed.

Dense index

In databases, a dense index is a file with pairs of keys and pointers for every record in the data file. Every key in this file is associated with a particular pointer to a record. In clustered indices with duplicate keys, the dense index points to the first record with that key.

Sparse index

In databases, a sparse index is a file with pairs of keys and pointers for every block in the data file. Every key in this file is associated with a particular pointer to a particular key in the sorted data file, and points to the lowest-bladed search key in each block.

Applications and limitations

Indexes are useful for many applications but come with some limitations.

With a wildcard at the beginning of the search-term, the database software is unable to use the underlying index data structure (in other words, the WHERE-clause is not sargable).
The solution to this problem can be solved through the addition of another index created on reverse(email_address) and a SQL query.

Cluster

When multiple databases and multiple tables are joined, it is called a cluster. The records for the tables sharing the value of a cluster key shall be stored together in the same or nearby data blocks.

The cluster configuration defines the data layout in the tables that are parts of the cluster.

Secondary index

Used to index fields that are neither ordering fields nor key fields (there is no assurance that the file is organized on key field or primary key field)