Index Types

Different index types are suitable for different query patterns and data types. Understanding the principles of each index type helps in choosing the best solution.

B-Tree Index

B-Tree (Balanced Tree) is the most commonly used index type, supporting equality and range queries.

Structure

                        ┌─────────────────────┐
                        │    [50]             │  Root
                        └─────────┬───────────┘
                    ┌─────────────┴─────────────┐
                    ▼                           ▼
          ┌─────────────┐             ┌─────────────┐
          │  [20, 35]   │             │  [70, 85]   │  Internal
          └──┬────┬────┬┘             └──┬────┬────┬┘
             │    │    │                 │    │    │
      ┌──────┘    │    └────┐     ┌─────┘    │    └─────┐
      ▼           ▼         ▼     ▼          ▼          ▼
┌─────────┐ ┌─────────┐ ┌─────────┐ ┌─────────┐ ┌─────────┐ ┌─────────┐
│5,10,15  │→│22,25,30 │→│38,42,48 │→│55,60,65 │→│72,78,82 │→│88,90,95 │  Leaf
└─────────┘ └─────────┘ └─────────┘ └─────────┘ └─────────┘ └─────────┘
     ↓           ↓           ↓           ↓           ↓           ↓
 Row Pointers (pointing to actual data rows)

Supported Operations

-- B-Tree can accelerate the following queries:
WHERE column = value           -- Equality query
WHERE column > value           -- Greater than
WHERE column < value           -- Less than
WHERE column >= value          -- Greater than or equal
WHERE column <= value          -- Less than or equal
WHERE column BETWEEN a AND b   -- Range query
WHERE column LIKE 'prefix%'    -- Prefix matching
ORDER BY column                -- Sorting

Creation Examples

-- PostgreSQL / MySQL
CREATE INDEX idx_users_email ON users(email);
CREATE INDEX idx_orders_date ON orders(created_at DESC);

-- Unique index
CREATE UNIQUE INDEX idx_users_email_unique ON users(email);

Hash Index

Hash indexes use a hash function to map keys to buckets, only supporting equality queries.

Structure

Hash Function: hash(key) → bucket_number

Bucket 0: [key1 → row_ptr] → [key5 → row_ptr]
Bucket 1: [key2 → row_ptr]
Bucket 2: [key3 → row_ptr] → [key7 → row_ptr] → [key9 → row_ptr]
Bucket 3: [key4 → row_ptr] → [key6 → row_ptr]
...

Characteristics

Feature	B-Tree	Hash
Equality query	O(log n)	O(1)
Range query	Supported	Not supported
Sorting	Supported	Not supported
Prefix matching	Supported	Not supported
Storage overhead	Larger	Smaller

Use Cases

-- PostgreSQL Hash index (suitable for high-cardinality equality queries)
CREATE INDEX idx_sessions_token ON sessions USING HASH (token);

-- Good for: WHERE token = 'abc123'
-- Not for: WHERE token > 'abc' or ORDER BY token

Note: MySQL InnoDB does not support Hash indexes (Memory engine does).

B+ Tree vs B-Tree

Most databases actually use B+ Tree:

B+ Tree Characteristics:
┌──────────────────────────────────────────────────────────┐
│ 1. All data stored in leaf nodes                         │
│ 2. Leaf nodes form doubly linked list for range queries  │
│ 3. Non-leaf nodes only store keys, allowing more entries │
└──────────────────────────────────────────────────────────┘

             [30, 60]            ← Internal nodes only store keys
            /    |    \
       [10,20] [40,50] [70,80]   ← Internal nodes
          │       │       │
    ┌─────┴───────┴───────┴─────┐
    ▼                           ▼
[10→data]↔[20→data]↔[30→data]↔[40→data]↔...  ← Leaf nodes form linked list

Full-Text Index

Full-text indexes are used for efficient text content searching.

PostgreSQL Full-Text Search

-- Create GIN full-text index
CREATE INDEX idx_articles_content ON articles
USING GIN (to_tsvector('english', title || ' ' || content));

-- Query
SELECT * FROM articles
WHERE to_tsvector('english', title || ' ' || content) @@ to_tsquery('database & optimization');

-- Query with ranking
SELECT
    title,
    ts_rank(to_tsvector('english', content), query) as rank
FROM articles, to_tsquery('database | postgresql') query
WHERE to_tsvector('english', content) @@ query
ORDER BY rank DESC;

MySQL Full-Text Search

-- Create full-text index
CREATE FULLTEXT INDEX idx_articles_fulltext ON articles(title, content);

-- Natural language search
SELECT * FROM articles
WHERE MATCH(title, content) AGAINST('database optimization' IN NATURAL LANGUAGE MODE);

-- Boolean mode search
SELECT * FROM articles
WHERE MATCH(title, content) AGAINST('+database -mysql' IN BOOLEAN MODE);

GIN Index (PostgreSQL)

GIN (Generalized Inverted Index) is suitable for columns containing multiple values.

Supported Data Types

-- Arrays
CREATE INDEX idx_posts_tags ON posts USING GIN (tags);
SELECT * FROM posts WHERE tags @> ARRAY['database', 'postgresql'];

-- JSONB
CREATE INDEX idx_users_metadata ON users USING GIN (metadata);
SELECT * FROM users WHERE metadata @> '{"role": "admin"}';

-- Full-text search
CREATE INDEX idx_docs_search ON documents USING GIN (to_tsvector('english', content));

GIN vs GiST

Feature	GIN	GiST
Build speed	Slow	Fast
Query speed	Fast	Slower
Update speed	Slow	Fast
Index size	Large	Small
Best for	Read-heavy	Frequent updates

GiST Index (PostgreSQL)

GiST (Generalized Search Tree) supports various advanced data types.

-- Geospatial data (PostGIS)
CREATE INDEX idx_locations_geom ON locations USING GIST (geom);
SELECT * FROM locations
WHERE ST_DWithin(geom, ST_MakePoint(-73.9857, 40.7484)::geography, 1000);

-- Range types
CREATE INDEX idx_events_duration ON events USING GIST (duration);
SELECT * FROM events WHERE duration && '[2024-01-01, 2024-01-31]'::daterange;

-- Exclusion constraint (prevent overlapping)
ALTER TABLE reservations
ADD CONSTRAINT no_overlapping_reservations
EXCLUDE USING GIST (room_id WITH =, time_range WITH &&);

BRIN Index (PostgreSQL)

BRIN (Block Range Index) is suitable for large tables with naturally ordered data.

-- Time series data (log tables)
CREATE INDEX idx_logs_created ON logs USING BRIN (created_at);

-- Characteristics:
-- - Very small index (possibly only 1% of B-Tree size)
-- - Suitable for data stored in physical order
-- - May scan more data blocks during queries

BRIN vs B-Tree Comparison

Table: logs (100 million rows, ordered by created_at)

B-Tree Index:
├── Size: ~2GB
├── Query time: ~1ms
└── Best for: Small range precise queries

BRIN Index:
├── Size: ~20MB
├── Query time: ~50ms
└── Best for: Large range scans, storage-constrained scenarios

Index Selection Guide

Query Type	Recommended Index Type
Equality query (=)	B-Tree, Hash
Range query (BETWEEN, greater/less than)	B-Tree
Prefix matching (LIKE 'abc%')	B-Tree
Full-text search	GIN (Full-Text)
Array containment	GIN
JSONB queries	GIN
Geospatial	GiST
Range overlap	GiST
Large tables with ordered data	BRIN

Index Types

On this page