Most database software includes indexing technology that enables sub-linear time lookup to improve performance, as linear search is inefficient for large databases. Suppose a database contains N data items and one must be retrieved based on the value of one of the fields. A simple implementation retrieves and examines each item according to the test. If there is only one matching item, this can stop when it finds that single item, but if there are multiple matches, it must test everything. An index is any data structure that improves the performance of lookup. There are many different data structures used for this purpose.
There are complex design trade-offs involving lookup performance, index size, and index update performance. Indexes are used to police database constraints, such as UNIQUE, EXCLUSION, PRIMARY KEY and FOREIGN KEY. An index may be declared as UNIQUE, which creates an implicit constraint on the underlying table. Some database systems support an EXCLUSION constraint that ensures that, for a newly inserted or updated record, a certain predicate holds for no other record. An index supporting fast searching for records satisfying the predicate is required to police such a constraint. The data is present in arbitrary order, but the logical ordering is specified by the index.
The data rows may be spread throughout the table regardless of the value of the indexed column or expression. The physical order of the rows is not the same as the index order. The indexed columns are typically non-primary key columns used in JOIN, WHERE, and ORDER BY clauses. There can be more than one non-clustered index on a database table. Clustering alters the data block into a certain distinct order to match the index, resulting in the row data being stored in order.
Therefore, only one clustered index can be created on a given database table. Clustered indices can greatly increase overall speed of retrieval, but usually only where the data is accessed sequentially in the same or reverse order of the clustered index, or when a range of items is selected. Since the physical records are in this sort order on disk, the next row item in the sequence is immediately before or after the last one, and so fewer data block reads are required. The primary feature of a clustered index is therefore the ordering of the physical data rows in accordance with the index blocks that point to them.