跳转至

Chapter 11 Indexing

Abstract

  • Basic Concepts
  • Ordered Indices
  • B+-Tree Index
  • B+-Tree File Organization
  • B-Tree Index Files
  • Indices on Multiple Keys
  • Indexing on Flash
  • Indexing in Main Memory
  • Write Optimized Indices
    • Log Structured Merge (LSM) Tree
    • Buffer Tree
  • Bitmap Indices

Basic Concepts

  • Indexing mechanisms used to speed up access to desired data.
    • E.g., author catalog in library
  • Search Key - attribute to set of attributes used to look up records in a file.
  • An index file consists of records (called index entries) of the form
search-key pointer
  • Index files are typically much smaller than the original file
  • Two basic kinds of indices:
    • Ordered indices: search keys are stored in sorted order(顺序索引)
    • Hash indices: search keys are distributed uniformly across “buckets” using a “hash function”.(散列索引)

Index Evaluation Metrics(评估指标)

  • Access types supported efficiently. E.g.,
    • Point query: records with a specified value in the attribute
    • Range query: or records with an attribute value falling in a specified range of values.
  • Access time
  • Insertion time
  • Deletion time
  • Space overhead

Ordered Indices

是否聚集索引的区别在于索引顺序是否与文件存储顺序一致

  • In an ordered index, index entries are stored sorted on the search key value. E.g., author catalog in library.
  • Primary index(主索引): in a sequentially ordered file, the index whose search key specifies the sequential order of the file.
    • Also called clustering index(聚集索引)
    • The search key of a primary index is usually but not necessarily the primary key.
  • Secondary index(辅助索引): an index whose search key specifies an order different from the sequential order of the file. Also called non-clustering index.
  • Index-sequential file(索引顺序文件): ordered sequential file with a primary index.

Primary index(主索引)

primary index on ID attribute of instructor

Secondary Indices Example

Secondary index on salary attribute of instructor

Dense Index Files

  • Dense index(稠密索引) — Index record appears for every search-key value in the file.
  • E.g. index on ID attribute of instructor relation

Dense index on dept_name, with instructor file sorted on dept_name

Sparse Index Files

  • Sparse Index(稀疏索引): contains index records for only some search-key values.
    • Applicable when records are sequentially ordered on search-key
  • To locate a record with search-key value K we:
    • Find index record with largest search-key value < K
    • Search file sequentially starting at the record to which the index record points

  • Compared to dense indices:
    • Less space and less maintenance overhead for insertions and deletions.
    • Generally slower than dense index for locating records.
  • Good tradeoff: sparse index with an index entry for every block in file, corresponding to least search-key value in the block.

Multilevel Index(多级索引)

评论