Explain database indexing and how it improves query performance.

⚙️ Backend Development · 9/21/2025
Understanding database indexes, types of indexes, how they work, and their impact on query optimization and performance.

Database Indexing

What are Indexes?

Indexes are data structures that improve the speed of data retrieval operations on a database table. They work like a book's index - providing a fast way to find specific information.

How Indexes Work

Without Index

-- Full table scan - O(n) time complexity
SELECT * FROM users WHERE email = 'john@example.com';
  • Database scans every row
  • Time increases linearly with table size
  • Slow for large tables

With Index

-- Index lookup - O(log n) time complexity
CREATE INDEX idx_users_email ON users(email);
SELECT * FROM users WHERE email = 'john@example.com';
  • Database uses index to find data quickly
  • Logarithmic time complexity
  • Much faster for large datasets
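One way to see the difference is to ask the database for its query plan. The sketch below uses Python's built-in sqlite3 module with an in-memory database. SQLite is an assumption made so the example runs anywhere, and the `plan` helper and sample rows are illustrative. Other engines word their plans differently.

```python
import sqlite3

def plan(conn, sql, params=()):
    """Return the detail column of SQLite's EXPLAIN QUERY PLAN output."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[-1] for row in rows]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO users (email, name) VALUES (?, ?)",
    [(f"user{i}@example.com", f"User {i}") for i in range(1000)],
)

query = "SELECT * FROM users WHERE email = ?"
args = ("user42@example.com",)

before = plan(conn, query, args)   # no index yet: the plan is a full scan
conn.execute("CREATE INDEX idx_users_email ON users(email)")
after = plan(conn, query, args)    # same query, now answered via the index

print(before)
print(after)
```

The exact plan text depends on the SQLite version, but the first plan reports a scan of `users` and the second a search using `idx_users_email`.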

Types of Indexes

1. Primary Index (Clustered)

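  • Created automatically for the primary key in engines that cluster on it (e.g. MySQL InnoDB, SQL Server by default)
  • Table rows are physically stored in index order
  • At most one per table
  • Fast for range queries on the key, because neighboring keys are stored together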

2. Secondary Index (Non-Clustered)

  • Additional indexes on other columns
  • Each entry points to the row's location (or, in some engines, to its primary key)
  • Multiple per table
  • Good for equality searches

3. Composite Index

-- Multi-column index
CREATE INDEX idx_users_name_age ON users(last_name, first_name, age);
  • Covers multiple columns
  • Order matters for optimization
  • Good for queries using multiple columns
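Why column order matters can be checked with a query plan. This is a minimal sketch using Python's sqlite3 module (an assumption; other engines behave similarly but report plans differently). The `plan` helper and the extra `email` column are hypothetical additions for illustration.

```python
import sqlite3

def plan(conn, sql):
    """Return the detail column of SQLite's EXPLAIN QUERY PLAN output."""
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

conn = sqlite3.connect(":memory:")
# email is a hypothetical extra column so the index does not cover SELECT *.
conn.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, last_name TEXT, "
    "first_name TEXT, age INTEGER, email TEXT)"
)
conn.execute("CREATE INDEX idx_users_name_age ON users(last_name, first_name, age)")

# Filters on the leading columns can seek into the index.
leading = plan(
    conn, "SELECT * FROM users WHERE last_name = 'Smith' AND first_name = 'Ann'"
)
# A filter on age alone skips the leading columns, so no index seek is possible.
trailing = plan(conn, "SELECT * FROM users WHERE age = 30")

print(leading)
print(trailing)
```

In this setup the first query searches `idx_users_name_age`, while the second falls back to a scan. An index on `(last_name, first_name, age)` helps queries that constrain `last_name`, or `last_name` and `first_name`, but not `age` by itself.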

4. Unique Index

-- Index names must be unique, so this one differs from idx_users_email created earlier
CREATE UNIQUE INDEX idx_users_email_unique ON users(email);
  • Ensures uniqueness
  • Prevents duplicate values
  • Combines constraint with performance
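The constraint side of a unique index can be demonstrated in a few lines. The sketch below assumes SQLite via Python's sqlite3 module; the table layout and sample address are illustrative only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
conn.execute("CREATE UNIQUE INDEX idx_users_email ON users(email)")

conn.execute("INSERT INTO users (email) VALUES (?)", ("john@example.com",))

try:
    # A second row with the same email violates the unique index.
    conn.execute("INSERT INTO users (email) VALUES (?)", ("john@example.com",))
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
print(duplicate_rejected, row_count)
```

The duplicate insert raises `sqlite3.IntegrityError`, so only the first row is stored.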

5. Partial Index

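-- PostgreSQL and SQLite syntax (SQL Server calls these filtered indexes; MySQL does not support them)
CREATE INDEX idx_active_users ON users(created_at) WHERE status = 'active';
  • Indexes only rows meeting the condition
  • Smaller index size
  • Faster for queries whose WHERE clause includes the same condition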
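A partial index is only considered when the query's filter implies the index condition. The following sketch checks this with SQLite through Python's sqlite3 module (an assumption; the `plan` helper, the `email` column, and the date literal are hypothetical).

```python
import sqlite3

def plan(conn, sql):
    """Return the detail column of SQLite's EXPLAIN QUERY PLAN output."""
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, "
    "status TEXT, created_at TEXT)"
)
conn.execute(
    "CREATE INDEX idx_active_users ON users(created_at) WHERE status = 'active'"
)

# The query repeats the index condition, so the partial index is usable.
with_condition = plan(
    conn,
    "SELECT * FROM users WHERE status = 'active' AND created_at > '2023-01-01'",
)
# Without the condition the index might miss rows, so it cannot be used.
without_condition = plan(
    conn, "SELECT * FROM users WHERE created_at > '2023-01-01'"
)

print(with_condition)
print(without_condition)
```

Note that the condition is written as a literal in the query. SQLite matches the index condition against the query text, so a bound parameter for `status` would not qualify.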

Index Data Structures

B-Tree Index (Most Common)

  • Structure: Balanced tree
  • Use Case: Range queries, equality
  • Time Complexity: O(log n)
  • Best For: Ordered data, sorting

Hash Index

  • Structure: Hash table
  • Use Case: Equality queries only
  • Time Complexity: O(1) average
  • Best For: Exact matches
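The trade-off between the two structures can be sketched in plain Python. Here a sorted list searched with `bisect` stands in for the ordered leaf level of a B-tree, and a `dict` stands in for a hash index. Both are simplified models, not how a database engine stores its indexes, and all names are illustrative.

```python
import bisect

# Sorted keys model the ordered layout of a B-tree index.
emails = sorted(f"user{i:04d}@example.com" for i in range(1000))

def ordered_lookup(keys, key):
    """Binary search for an exact key: O(log n). Returns its position or None."""
    i = bisect.bisect_left(keys, key)
    return i if i < len(keys) and keys[i] == key else None

def ordered_range(keys, low, high):
    """All keys with low <= key <= high, found with two binary searches."""
    start = bisect.bisect_left(keys, low)
    stop = bisect.bisect_right(keys, high)
    return keys[start:stop]

# A dict models a hash index: O(1) average lookups, but no key ordering.
hash_index = {key: position for position, key in enumerate(emails)}

exact_ordered = ordered_lookup(emails, "user0042@example.com")
exact_hashed = hash_index["user0042@example.com"]
in_range = ordered_range(emails, "user0010@example.com", "user0019@example.com")

print(exact_ordered, exact_hashed, len(in_range))
```

Both structures answer the equality lookup, but only the ordered one can return a contiguous range without visiting every key, which is why hash indexes are limited to exact matches.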

Bitmap Index

  • Structure: Bit arrays
  • Use Case: Low cardinality columns
  • Best For: Data warehousing, analytics
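A bitmap index keeps one bit array per distinct value, with bit i set when row i holds that value. The sketch below models this with Python integers as bit arrays; the `status` and `plan` columns and their values are hypothetical sample data, and real engines add compression that is omitted here.

```python
def build_bitmap_index(values):
    """Map each distinct value to an int whose bit i is set if row i has it."""
    index = {}
    for row, value in enumerate(values):
        index[value] = index.get(value, 0) | (1 << row)
    return index

def rows_from_bitmap(bitmap):
    """Return the row numbers whose bits are set."""
    return [row for row in range(bitmap.bit_length()) if (bitmap >> row) & 1]

# Two low-cardinality columns for six rows (illustrative data).
statuses = ["active", "inactive", "active", "banned", "active", "inactive"]
plans = ["free", "pro", "pro", "free", "pro", "free"]

status_index = build_bitmap_index(statuses)
plan_index = build_bitmap_index(plans)

# WHERE status = 'active' AND plan = 'pro' becomes a single bitwise AND.
matches = rows_from_bitmap(status_index["active"] & plan_index["pro"])
print(matches)
```

Combining filters with bitwise AND and OR is what makes this structure attractive for analytic queries over columns with few distinct values.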

Query Optimization with Indexes

Index Selection Rules

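  1. Selectivity: Columns with many distinct values (high selectivity) benefit most from an index
  2. Query Patterns: Index columns used in WHERE, ORDER BY, and JOIN clauses
  3. Composite Order: Put equality-filtered columns first, then range or sort columns; among equality columns, prefer the more selective one
  4. Covering Indexes: Include every column the query reads so the table itself need not be accessed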

Example Optimization

-- Slow query
SELECT user_id, name, email 
FROM users 
WHERE status = 'active' 
  AND created_at > '2023-01-01'
ORDER BY created_at DESC;

-- Matching index: equality column (status) first, then the range/sort column (created_at)
CREATE INDEX idx_users_status_created 
ON users(status, created_at DESC);
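The effect of this index on the example query can be inspected with a query plan. The sketch assumes SQLite through Python's sqlite3 module and an empty `users` table with the columns the query mentions; the `plan` helper is illustrative, and other engines report plans in their own format.

```python
import sqlite3

def plan(conn, sql):
    """Return the detail column of SQLite's EXPLAIN QUERY PLAN output."""
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (user_id INTEGER PRIMARY KEY, name TEXT, "
    "email TEXT, status TEXT, created_at TEXT)"
)

query = (
    "SELECT user_id, name, email FROM users "
    "WHERE status = 'active' AND created_at > '2023-01-01' "
    "ORDER BY created_at DESC"
)

before = plan(conn, query)  # scan plus a separate sort step
conn.execute(
    "CREATE INDEX idx_users_status_created ON users(status, created_at DESC)"
)
after = plan(conn, query)   # index search; rows already arrive in order

print(before)
print(after)
```

Before the index exists, the plan includes a temporary B-tree for the ORDER BY. Afterwards the index supplies rows in `created_at` order within `status = 'active'`, so the sort step disappears.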

Index Performance Impact

Benefits

  • Faster SELECT: Dramatic improvement in read performance
  • Faster JOIN: Improves join operation speed
  • Faster ORDER BY: Can eliminate sorting step
  • Unique Constraints: Enforces data integrity

Costs

  • Storage Overhead: Additional disk space required
  • Write Performance: INSERT, UPDATE, and DELETE get slower because every affected index must be updated too
  • Maintenance: Indexes can fragment or bloat as data changes and may need rebuilding
  • Memory Usage: Indexes consume RAM

Index Maintenance

Monitoring Index Usage

-- PostgreSQL: Check index usage (idx_scan = 0 suggests the index is unused)
SELECT
    schemaname,
    relname,
    indexrelname,
    idx_scan,
    idx_tup_read,
    idx_tup_fetch
FROM pg_stat_user_indexes;

-- MySQL: Check index cardinality
SHOW INDEX FROM users;

Index Cleanup

-- Remove unused indexes (PostgreSQL syntax; MySQL needs DROP INDEX idx_users_unused ON users)
DROP INDEX idx_users_unused;

-- Rebuild fragmented indexes (PostgreSQL)
REINDEX INDEX idx_users_email;

Best Practices

1. Index Strategy

  • Analyze Queries: Identify slow queries first
  • Measure Impact: Before and after performance
  • Monitor Usage: Remove unused indexes
  • Balance: Read vs write performance

2. Design Guidelines

-- Good: Specific, selective
CREATE INDEX idx_orders_customer_date 
ON orders(customer_id, order_date);

-- Bad: Too broad, low selectivity
CREATE INDEX idx_users_gender ON users(gender);
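Selectivity can be estimated as the number of distinct values divided by the number of rows. The sketch below computes it with SQLite through Python's sqlite3 module; the `selectivity` helper and the generated rows are hypothetical, and the helper interpolates identifiers into SQL, so it should only be given trusted table and column names.

```python
import sqlite3

def selectivity(conn, table, column):
    """Distinct values / total rows. Values near 1.0 suit an index best.

    Identifiers are interpolated directly; pass trusted names only.
    """
    distinct, total = conn.execute(
        f"SELECT COUNT(DISTINCT {column}), COUNT(*) FROM {table}"
    ).fetchone()
    return distinct / total if total else 0.0

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, gender TEXT)")
conn.executemany(
    "INSERT INTO users (email, gender) VALUES (?, ?)",
    [(f"user{i}@example.com", "f" if i % 2 else "m") for i in range(1000)],
)

email_selectivity = selectivity(conn, "users", "email")
gender_selectivity = selectivity(conn, "users", "gender")
print(email_selectivity, gender_selectivity)
```

With this sample data every email is distinct while gender has only two values, so an index on `gender` would still leave about half the table to read for each lookup.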

3. Query Writing

-- Index-friendly query
SELECT * FROM users WHERE email = 'john@example.com';

-- Index-unfriendly query: wrapping the column in a function prevents use of a plain index on email
SELECT * FROM users WHERE UPPER(email) = 'JOHN@EXAMPLE.COM';
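The two queries can be compared through their plans. This sketch assumes SQLite via Python's sqlite3 module and a plain index on `email`; the `plan` helper and the `name` column are illustrative.

```python
import sqlite3

def plan(conn, sql):
    """Return the detail column of SQLite's EXPLAIN QUERY PLAN output."""
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, name TEXT)")
conn.execute("CREATE INDEX idx_users_email ON users(email)")

# Comparing the bare column lets the planner seek into the index.
friendly = plan(conn, "SELECT * FROM users WHERE email = 'john@example.com'")
# UPPER(email) is not what the index stores, so every row must be evaluated.
unfriendly = plan(
    conn, "SELECT * FROM users WHERE UPPER(email) = 'JOHN@EXAMPLE.COM'"
)

print(friendly)
print(unfriendly)
```

The first plan is an index search and the second is a scan, even though both queries target the same column.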

Advanced Indexing Concepts

Covering Indexes

-- Include frequently accessed columns (INCLUDE is PostgreSQL 11+ and SQL Server syntax)
CREATE INDEX idx_users_covering 
ON users(email) 
INCLUDE (name, phone, address);
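Whether a query is answered from the index alone shows up in the query plan. SQLite has no INCLUDE clause, so this sketch swaps in a different technique: it lists the extra columns as trailing key columns, which covers the query in a similar way. The sqlite3 setup, the `plan` helper, and the reduced column list are assumptions for illustration.

```python
import sqlite3

def plan(conn, sql):
    """Return the detail column of SQLite's EXPLAIN QUERY PLAN output."""
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, "
    "name TEXT, phone TEXT, address TEXT)"
)
# Trailing key columns stand in for INCLUDE (name, phone).
conn.execute("CREATE INDEX idx_users_covering ON users(email, name, phone)")

# Every selected column lives in the index, so the table is never touched.
covered = plan(
    conn, "SELECT name, phone FROM users WHERE email = 'john@example.com'"
)
# address is not in the index, so each match needs a table lookup as well.
not_covered = plan(
    conn, "SELECT name, address FROM users WHERE email = 'john@example.com'"
)

print(covered)
print(not_covered)
```

SQLite labels the first plan as using a covering index, while the second uses the same index without that label.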

Expression Indexes

-- Index on computed values (PostgreSQL syntax; MySQL 8.0.13+ needs an extra pair of parentheses around the expression)
CREATE INDEX idx_users_lower_email 
ON users(LOWER(email));
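An expression index lets a query that filters on the same expression use an index search. The sketch below assumes SQLite (which supports indexes on deterministic expressions) through Python's sqlite3 module; the `plan` helper and the `name` column are illustrative.

```python
import sqlite3

def plan(conn, sql, params=()):
    """Return the detail column of SQLite's EXPLAIN QUERY PLAN output."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[-1] for row in rows]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, name TEXT)")

query = "SELECT * FROM users WHERE LOWER(email) = ?"
args = ("john@example.com",)

before = plan(conn, query, args)  # no matching index: full scan
conn.execute("CREATE INDEX idx_users_lower_email ON users(LOWER(email))")
after = plan(conn, query, args)   # the expression now matches the index

print(before)
print(after)
```

The query has to use the same expression as the index definition for the planner to match them.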

Index Hints

-- MySQL: suggest a specific index to the optimizer; FORCE INDEX is the stronger form (use carefully)
SELECT * FROM users USE INDEX (idx_users_email) 
WHERE email = 'john@example.com';

Database-Specific Features

PostgreSQL

  • GIN indexes for full-text search
  • GiST indexes for geometric data
  • BRIN indexes for very large tables whose values follow physical row order (e.g. append-only timestamps)

MySQL

  • Full-text indexes for search
  • Spatial indexes for geographic data
  • Memory engine for temporary tables (uses hash indexes by default)

MongoDB

  • Compound indexes
  • Text indexes for search
  • Geospatial indexes
By: System Admin