Chapter 11 Indexing¶
Abstract
- Basic Concepts
- Ordered Indices
- B+-Tree Index
- B+-Tree File Organization
- B-Tree Index Files
- Indices on Multiple Keys
- Indexing on Flash
- Indexing in Main Memory
- Write Optimized Indices
- Log Structured Merge (LSM) Tree
- Buffer Tree
- Bitmap Indices
Basic Concepts¶
- Indexing mechanisms used to speed up access to desired data.
- E.g., author catalog in library
- Search Key - attribute to set of attributes used to look up records in a file.
- An index file consists of records (called index entries) of the form
search-key | pointer |
---|---|
- Index files are typically much smaller than the original file
- Two basic kinds of indices:
- Ordered indices: search keys are stored in sorted order(顺序索引)
- Hash indices: search keys are distributed uniformly across “buckets” using a “hash function”.(散列索引)
Index Evaluation Metrics(评估指标)¶
- Access types supported efficiently. E.g.,
- Point query: records with a specified value in the attribute
- Range query: or records with an attribute value falling in a specified range of values.
- Access time
- Insertion time
- Deletion time
- Space overhead
Ordered Indices¶
是否聚集索引的区别在于索引顺序是否与文件存储顺序一致
- In an ordered index, index entries are stored sorted on the search key value. E.g., author catalog in library.
- Primary index(主索引): in a sequentially ordered file, the index whose search key specifies the sequential order of the file.
- Also called clustering index(聚集索引)
- The search key of a primary index is usually but not necessarily the primary key.
- Secondary index(辅助索引): an index whose search key specifies an order different from the sequential order of the file. Also called non-clustering index.
- Index-sequential file(索引顺序文件): ordered sequential file with a primary index.
Primary index(主索引)
primary index on ID attribute of instructor
Secondary Indices Example
Secondary index on salary attribute of instructor
Dense Index Files¶
- Dense index(稠密索引) — Index record appears for every search-key value in the file.
- E.g. index on ID attribute of instructor relation
Dense index on dept_name, with instructor file sorted on dept_name
Sparse Index Files¶
- Sparse Index(稀疏索引): contains index records for only some search-key values.
- Applicable when records are sequentially ordered on search-key
- To locate a record with search-key value K we:
- Find index record with largest search-key value < K
- Search file sequentially starting at the record to which the index record points
- Compared to dense indices:
- Less space and less maintenance overhead for insertions and deletions.
- Generally slower than dense index for locating records.
- Good tradeoff: sparse index with an index entry for every block in file, corresponding to least search-key value in the block.