Wide-Column & Time-Series Models
TL;DR
Wide-column stores like Cassandra organize data by partition key and clustering key, making them ideal for massive write throughput and time-ordered data. Time-series databases like TimescaleDB add specialized compression and retention on top of familiar SQL. Choose these when your workload is append-heavy and your queries follow predictable patterns.
Think of It Like a Filing Cabinet With a Twist
Imagine a filing cabinet where each drawer is labeled with a customer name (the partition key), and inside each drawer, documents are sorted by date (the clustering key). You can quickly pull all documents for one customer in date order — but asking "show me all documents from March across all customers" means opening every single drawer.
That's the wide-column model. It's blazing fast for the queries it's designed for, and painful for everything else. You design the schema around your queries, not your entities.
Cassandra — The Wide-Column Workhorse
Cassandra is the most widely-used wide-column store. Its data model revolves around two concepts:
Partition key: Determines which node stores the data. All rows with the same partition key live on the same node.
Clustering key: Determines the sort order within a partition. Rows with the same partition key are sorted by clustering key on disk.
CREATE TABLE messages (
channel_id UUID,
message_id TIMEUUID,
author_id UUID,
content TEXT,
created_at TIMESTAMP,
PRIMARY KEY (channel_id, message_id)
) WITH CLUSTERING ORDER BY (message_id DESC);
Here, channel_id is the partition key and message_id (a time-based UUID) is the clustering key. This means:
- All messages for one channel are stored together on the same node
- Within a channel, messages are sorted by time (newest first)
- "Get the last 50 messages in channel X" is a single sequential read — extremely fast
Query-First Design
In relational databases, you model entities first and figure out queries later. In Cassandra, you do the opposite: start with your queries, then design tables to serve them.
Need messages by channel? One table. Need messages by user? A second table with the same data, partitioned differently:
-- Optimized for "messages in a channel"
CREATE TABLE messages_by_channel (
channel_id UUID,
message_id TIMEUUID,
content TEXT,
PRIMARY KEY (channel_id, message_id)
);
-- Optimized for "messages by a user"
CREATE TABLE messages_by_user (
user_id UUID,
message_id TIMEUUID,
content TEXT,
PRIMARY KEY (user_id, message_id)
);
Yes, you store the data twice. That's the trade-off: storage duplication for query performance. There are no joins in Cassandra, so every query pattern needs its own table.
Time-Series Bucketing
For time-series data (metrics, logs, IoT sensors), unbounded partitions are dangerous. If a sensor sends data every second for a year, that's 31 million rows in one partition — too large.
The solution is bucketing: add a time component to the partition key.
CREATE TABLE sensor_readings (
sensor_id UUID,
day DATE,
reading_time TIMESTAMP,
value DOUBLE,
PRIMARY KEY ((sensor_id, day), reading_time)
);
Now each partition holds one day's data for one sensor. Predictable size, easy to query ("readings for sensor X on March 15"), and old data can be dropped by TTL or partition deletion.
Discord's Database Journey
Discord's migration path is one of the best real-world case studies for wide-column databases:
| Year | Database | Scale | Problem |
|---|---|---|---|
| 2015 | MongoDB | Small | Worked fine at low scale |
| 2017 | Cassandra | Billions of messages, 12 nodes | Needed write throughput + time-ordered reads |
| 2022 | ScyllaDB | Trillions of messages, 177→72 nodes | Cassandra's Java GC pauses caused unpredictable latency |
Why Cassandra worked initially: Messages are append-only (write-heavy), access patterns are predictable (messages by channel, sorted by time), and Cassandra's partition model maps perfectly to (channel_id, time_bucket).
Why they moved to ScyllaDB: At 177 nodes, Cassandra's Java garbage collector caused multi-second pauses. Hot partitions (popular channels) created cascading latency. ScyllaDB is a C++ rewrite of Cassandra — same data model, same query language (CQL), but no garbage collector. Their p99 read latency dropped from 40-125ms to 15ms.
The key lesson isn't "use ScyllaDB" — it's that the wide-column data model was right for their access pattern. The implementation changed, the model stayed.
HBase and Bigtable
Google Bigtable inspired the entire wide-column category. It powers Google's internal systems — search indexing, Maps, Gmail, Analytics. Not available as a standalone product, but Cloud Bigtable is the managed version.
HBase is the open-source Bigtable clone, running on top of HDFS (Hadoop's file system). It's common in analytics and batch-processing pipelines where you need to scan large ranges of data.
Both use the same model: row key → column families → columns → timestamped values. The differences are operational: Bigtable is managed by Google, HBase requires significant operational expertise.
TimescaleDB — Time-Series on PostgreSQL
What if you want time-series performance but don't want to learn a new database?
TimescaleDB is a PostgreSQL extension that adds time-series superpowers:
- Hypertables: Automatic time-based partitioning. You create a regular table, call
create_hypertable(), and TimescaleDB handles partition management. - Compression: 10-20x compression on older data. Recent data stays uncompressed for fast writes.
- Continuous aggregates: Materialized views that automatically update as new data arrives. Pre-compute hourly/daily rollups without batch jobs.
- Retention policies: Automatically drop data older than N days.
-- Create a regular table
CREATE TABLE metrics (
time TIMESTAMPTZ NOT NULL,
sensor_id INTEGER,
temperature DOUBLE PRECISION,
humidity DOUBLE PRECISION
);
-- Convert to hypertable (automatic time partitioning)
SELECT create_hypertable('metrics', 'time');
-- Query like normal PostgreSQL
SELECT time_bucket('1 hour', time) AS hour,
AVG(temperature) AS avg_temp
FROM metrics
WHERE sensor_id = 42
AND time > NOW() - INTERVAL '7 days'
GROUP BY hour
ORDER BY hour;
The beauty: you get time-series performance with full SQL, joins, transactions, and your existing PostgreSQL tooling.
When to Choose Wide-Column or Time-Series
| Signal | Database | Why |
|---|---|---|
| Massive write throughput (millions/sec) | Cassandra / ScyllaDB | LSM tree storage, distributed writes |
| Time-ordered data with known access patterns | Cassandra / ScyllaDB | Partition + clustering key model |
| IoT sensor data or metrics | TimescaleDB | Hypertables, compression, retention |
| Need joins or transactions alongside time-series | TimescaleDB | It's PostgreSQL underneath |
| Analytics on historical data | HBase / Bigtable | Large range scans, batch processing |
When NOT to Choose These
- Ad-hoc queries: "Show me all messages containing the word 'bug' across all channels" — Cassandra can't do this efficiently. You need a search engine alongside it.
- Joins: Wide-column stores don't support joins. Period. Design around it or use a different database.
- Transactions across partitions: Cassandra offers lightweight transactions (Paxos-based) for single-partition operations, but nothing like ACID across partitions.
- Small datasets: If your data fits on one machine, PostgreSQL with proper indexing will outperform Cassandra on almost every query pattern with far less operational complexity.
Interview Tip
Only bring up Cassandra or a wide-column store when the problem explicitly involves massive write throughput or time-series data (messaging, metrics, activity feeds). If you mention Cassandra for a simple CRUD app, it signals you're choosing buzzwords over pragmatism.
Quick Recap
| Database | Model | Best For | Avoid When |
|---|---|---|---|
| Cassandra | Partition key + clustering key | High write throughput, time-ordered data | Ad-hoc queries, joins, small data |
| ScyllaDB | Same as Cassandra (C++ rewrite) | Same, but need lower latency | Same limitations |
| TimescaleDB | PostgreSQL + time partitioning | Time-series with SQL needs | Non-time-series workloads |
| HBase/Bigtable | Row key + column families | Analytics, batch processing | Real-time serving |