Databases & Data Storage

Inverted Index

Definition

An inverted index is a data structure that maps content, such as words or numbers, to its locations in a document, a set of documents, or a database file. It is the most widely used data structure in full-text search engines.

Why It Matters

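The inverted index is what makes full-text search fast. Instead of scanning every document for a word at query time, the search engine looks the word up in the index and directly retrieves the list of documents that contain it. The cost of reading every document is paid once, when the index is built, rather than on every query.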
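The difference between scanning and looking up can be illustrated with a minimal Python sketch. The sample documents and the function names `scan_search` and `index_search` are hypothetical, chosen only for illustration; a real search engine uses far more compact, disk-friendly structures than a dictionary of sets.

```python
from collections import defaultdict

# Hypothetical sample documents, keyed by document ID.
docs = {
    1: "the quick brown fox",
    2: "the lazy dog",
    3: "a quick brown dog",
}


def scan_search(word):
    """Linear scan: split and inspect every document on every query."""
    return sorted(doc_id for doc_id, text in docs.items() if word in text.split())


# Build the index once: word -> set of IDs of the documents containing it.
index = defaultdict(set)
for doc_id, text in docs.items():
    for word in text.split():
        index[word].add(doc_id)


def index_search(word):
    """Index lookup: one dictionary access, no document is re-read."""
    return sorted(index.get(word, set()))


print(scan_search("quick"))   # [1, 3]
print(index_search("quick"))  # [1, 3]
```

Both functions return the same result, but `scan_search` touches every document on each call, while `index_search` only reads the entry for the queried word.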

Contextual Example

An inverted index for a set of documents resembles the index at the back of a book. It holds a list of all unique words, and next to each word, a list of the documents and positions where that word appears.
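As a rough sketch of that layout, the following Python snippet builds a word-to-documents-and-positions mapping for two tiny documents. The document IDs, the sample text, and the function name `build_positional_index` are assumptions made for illustration, and positions are counted as zero-based word offsets.

```python
from collections import defaultdict

# Hypothetical sample documents.
docs = {
    "doc1": "to be or not to be",
    "doc2": "to do is to be",
}


def build_positional_index(documents):
    """Map each word to {doc_id: [word positions]} (zero-based offsets)."""
    index = defaultdict(lambda: defaultdict(list))
    for doc_id, text in documents.items():
        for position, word in enumerate(text.split()):
            index[word][doc_id].append(position)
    return index


positional_index = build_positional_index(docs)

# Print the entries alphabetically, like the index at the back of a book.
for word in sorted(positional_index):
    print(word, dict(positional_index[word]))
# be {'doc1': [1, 5], 'doc2': [4]}
# do {'doc2': [1]}
# is {'doc2': [2]}
# not {'doc1': [3]}
# or {'doc1': [2]}
# to {'doc1': [0, 4], 'doc2': [0, 3]}
```

Storing positions alongside document IDs is what allows phrase and proximity queries, at the cost of a larger index.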

Common Misunderstandings

  • Building an inverted index typically involves tokenizing text, normalizing words (e.g., converting them to lowercase), and often removing stop words.
  • It is the core data structure behind the Lucene search library and the search engines built on it, such as Elasticsearch.
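The build steps named in the first bullet (tokenizing, normalizing, removing stop words) can be sketched in Python as follows. This is a simplified illustration, not how Lucene or Elasticsearch implement text analysis: the `STOP_WORDS` set, the regular expression used as a tokenizer, and the function names `analyze` and `build_index` are all assumptions.

```python
import re
from collections import defaultdict

# Illustrative stop-word list; real analyzers use longer, language-specific lists.
STOP_WORDS = {"a", "an", "and", "is", "of", "the", "to"}


def analyze(text):
    """Normalize to lowercase, tokenize on letter/digit runs, drop stop words."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [token for token in tokens if token not in STOP_WORDS]


def build_index(documents):
    """Map each remaining term to the set of document IDs that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for term in analyze(text):
            index[term].add(doc_id)
    return index


print(analyze("The Quick, brown FOX!"))  # ['quick', 'brown', 'fox']
```

Query text generally needs to pass through the same `analyze` step before lookup. Otherwise a query for "Fox" would miss the indexed term "fox".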

Last Updated: December 17, 2025