
In large-scale systems, database performance can make or break the user experience.
When APIs slow down, dashboards lag, or permission checks take too long, the usual instinct is: “Add an index.”
While indexing is crucial, it’s not always enough — especially when dealing with billions of rows. In such cases, partitioning can be a game-changer.
Recently, while working for a client, I significantly improved their system performance by combining PostgreSQL partitioning with targeted indexing. The result? Queries that used to take hundreds of milliseconds now run in a few milliseconds.
🛠 What is Partitioning?
Partitioning is the process of splitting a large table into smaller, more manageable pieces, called partitions.
From the application’s perspective, it still looks like a single table, but under the hood, queries can skip irrelevant partitions — drastically reducing the data scanned.
Types of Partitioning in PostgreSQL
- Range Partitioning
- Rows are stored based on a value range (e.g., date ranges).
- Example: One partition for each month of data.
- Best for time-series data like logs, metrics, or events.
- List Partitioning
- Rows are stored based on discrete values in a column.
- Example: Partition per country or region.
- Best when the column has a small set of distinct values.
- Hash Partitioning
- A hash function distributes rows evenly across partitions.
- Example: Partition by tenant_id for a multi-tenant SaaS system.
- Best for evenly distributing data without natural ranges.
- Composite Partitioning
- Combines two or more methods.
- Example: First partition by range (year), then subpartition by hash (tenant_id).
PostgreSQL supports native partitioning since version 10, and it automatically routes inserts and queries to the correct partition when set up properly.
🔍 What is Indexing?
An index is like a fast lookup table for your database — instead of scanning the whole table, the database can jump directly to the rows that match your query conditions.
Types of common indexes in PostgreSQL:
- B-Tree Indexes → Best for equality and range lookups (most common)
- Hash Indexes → Best for equality checks (less common, more specialized)
- GIN / GiST Indexes → Best for full-text search or JSONB fields
Indexes speed up lookups, but as tables grow into hundreds of millions of rows, even indexes can become large and slow. That’s where partitioning helps: instead of one massive index, you get many small indexes — one per partition.
🧩 Real-World Scenario: Massive Permissions Lookup
My client runs a large authentication and authorization system that manages:
- 10 million users (entities)
- 5 million resources (APIs, files, services)
- Billions of permission mappings
Every API request must check:
“Does this user have permission X for resource Y?”
A simplified schema:
CREATE TABLE permissions (
organization_id INT,
entity_id BIGINT,
resource_id BIGINT,
permission_type TEXT,
granted_at TIMESTAMP
);
🐌 The Problem
Even with multiple indexes:
CREATE INDEX idx_entity_perm
ON permissions (entity_id, permission_type);
CREATE INDEX idx_resource_perm
ON permissions (resource_id, permission_type);
Queries like:
SELECT 1
FROM permissions
WHERE organization_id = 32
AND entity_id = 983233
AND resource_id = 424
AND permission_type = 'read';
…took 150–250 ms under heavy load.
Why?
- The table was too large
- Indexes were too big, making scans expensive
- All queries were scoped to a specific organization (tenant), but the database still had to search across the entire table
⚙️ My Solution: Tenant-Based Hash Partitioning + Targeted Indexing
After profiling with EXPLAIN ANALYZE
, I saw a clear pattern:
- Every query filtered by
organization_id
- Inside that scope, queries filtered by
entity_id
andresource_id
So, I decided to partition by organization_id using hash partitioning.
Step 1: Create Partitioned Table
CREATE TABLE permissions (
organization_id INT,
entity_id BIGINT,
resource_id BIGINT,
permission_type TEXT,
granted_at TIMESTAMP
) PARTITION BY HASH (organization_id);
Step 2: Create Hash Partitions
CREATE TABLE permissions_org_0
PARTITION OF permissions FOR VALUES WITH (MODULUS 8, REMAINDER 0);
CREATE TABLE permissions_org_1
PARTITION OF permissions FOR VALUES WITH (MODULUS 8, REMAINDER 1);
-- ... up to permissions_org_7
This split the data evenly across 8 partitions.
Step 3: Add Indexes Inside Each Partition
CREATE INDEX idx_entity_resource_perm
ON permissions_org_0 (entity_id, resource_id, permission_type);
Step 4: Query Execution
Now, when running:
SELECT 1
FROM permissions
WHERE organization_id = 32
AND entity_id = 983233
AND resource_id = 424
AND permission_type = 'read';
- PostgreSQL immediately routes the query to the right partition
- The small index inside that partition is scanned instead of a massive global index
📈 Performance Boost
Before: 150–250 ms
After: 2–5 ms
This improvement allowed the client to handle more concurrent API requests, reduce DB load, and eliminate permission-check bottlenecks.
📊 Partitioning vs Indexing: When to Use What?
Feature | Indexing | Partitioning |
---|---|---|
Purpose | Speeds up row lookups | Reduces scanned data |
Best for | Selective columns | Always-filtered columns |
Complexity | Low | Medium–High |
Scaling | Slower at huge scale | Scales better with data growth |
Write impact | Minimal | Slight overhead |
Maintenance | Easy | Needs automation |
🤝 Why They Work Best Together
- Partitioning: Skips irrelevant data entirely
- Indexing: Finds the exact row inside the relevant data
In this client project, hash partitioning by organization_id plus indexes on (entity_id, resource_id, permission_type) made the permission checks instant.
⚠️ Things to Watch
- Don’t over-partition (too many partitions slow the planner)
- Pick the right key — it must be in almost every query’s
WHERE
clause - Joins across partitions can be tricky
- Automate partition creation if the key space changes
🧠 Final Takeaway
Think of indexing as a magnifying glass — fast for finding something in a book.
But when the “book” is a warehouse, partitioning organizes shelves so you only search in one section.
For massive multi-tenant systems like authentication/permissions, partitioning + indexing isn’t just an optimization — it’s a necessity for scalability.
Leave a Reply