Materialized Views in Cassandra: A Complete Guide

10/12/2025

All Articles

Materialized View architecture in Cassandra showing synchronization between base table and view

Materialized Views in Cassandra: A Complete Guide

In Apache Cassandra, Materialized Views (MVs) provide a powerful mechanism for automatically maintaining alternate query patterns without duplicating data manually. They simplify querying by creating a new table that automatically stays in sync with the base table.

Introduced in Cassandra 3.x, Materialized Views help reduce complexity in application logic by letting you query the same data using different primary key configurations.

1. What Is a Materialized View?

A Materialized View in Cassandra is a predefined query result that is automatically updated as the base table changes. Unlike a traditional view in relational databases, Cassandra’s MVs physically store data, improving query speed at the cost of additional storage and write overhead.

Example Use Case:

Suppose you have a users table that stores user details:

CREATE TABLE users (
    user_id UUID PRIMARY KEY,
    name text,
    email text,
    city text
);

By default, you can only query by user_id. If you also want to query users by email, you can create a Materialized View instead of creating a separate denormalized table.

2. How Materialized Views Work

When you create a Materialized View, Cassandra:

Creates a new table based on a different primary key.
Automatically propagates updates, inserts, and deletes from the base table.
Ensures that the view stays consistent with the base table.

Each node responsible for the base table data also manages the corresponding view data to ensure consistency.

3. Creating a Materialized View

Syntax:

CREATE MATERIALIZED VIEW keyspace_name.view_name AS
SELECT <columns>
FROM keyspace_name.base_table
WHERE <column> IS NOT NULL
PRIMARY KEY (<new_primary_key>);

Example:

CREATE MATERIALIZED VIEW users_by_email AS
SELECT user_id, name, email, city
FROM users
WHERE email IS NOT NULL
PRIMARY KEY (email, user_id);

Explanation:

The WHERE clause ensures the indexed column (email) is not null.
The PRIMARY KEY defines how the data is partitioned and clustered in the view.

Now you can query the users_by_email view directly:

SELECT * FROM users_by_email WHERE email = 'john@example.com';

4. Updating and Deleting Data in Materialized Views

All data changes in the base table are automatically reflected in the Materialized View.

Example:

UPDATE users SET city = 'Mumbai' WHERE user_id = 1a2b3c4d;

This update will also modify the users_by_email view automatically.

Similarly, if you delete a row:

DELETE FROM users WHERE user_id = 1a2b3c4d;

The corresponding entry in the view is also removed.

5. Benefits of Materialized Views

Benefit	Description
Simplified query logic	No need to manually maintain multiple denormalized tables.
Automatic synchronization	Base table and view stay consistent automatically.
Improved query performance	Enables fast lookups using alternate keys.
Reduced development complexity	Eliminates the need for maintaining synchronization logic at the application level.

6. Limitations of Materialized Views

Despite their convenience, Materialized Views come with certain limitations.

Limitation	Explanation
Eventual consistency	Updates may take time to propagate; strong consistency isn’t guaranteed.
Write amplification	Each write to the base table triggers writes to associated views.
Schema restrictions	Cannot create a view with arbitrary WHERE clauses or aggregation.
Performance overhead	Can increase disk I/O and storage usage.
Limited support for complex queries	Views don’t support filtering beyond their defined key.

7. Managing Materialized Views

List All Views:

DESCRIBE MATERIALIZED VIEWS;

Drop a Materialized View:

DROP MATERIALIZED VIEW users_by_email;

Check View Definition:

DESCRIBE MATERIALIZED VIEW users_by_email;

8. Best Practices for Using Materialized Views

✅ Recommended:

Use Materialized Views for simple lookups (e.g., querying by a secondary key).
Always include non-null constraints in the WHERE clause.
Ensure that the base table’s partition key is part of the view’s primary key.
Monitor performance using nodetool cfstats.

🚫 Avoid:

Using Materialized Views for high write throughput workloads.
Creating multiple views per base table.
Using views for range queries or aggregations.

9. Example: Orders by Customer View

Base Table:

CREATE TABLE orders (
    order_id UUID PRIMARY KEY,
    customer_id UUID,
    order_date timestamp,
    total_amount double
);

Materialized View:

CREATE MATERIALIZED VIEW orders_by_customer AS
SELECT order_id, customer_id, order_date, total_amount
FROM orders
WHERE customer_id IS NOT NULL AND order_id IS NOT NULL
PRIMARY KEY (customer_id, order_date, order_id)
WITH CLUSTERING ORDER BY (order_date DESC);

Query Example:

SELECT * FROM orders_by_customer WHERE customer_id = 12345 ORDER BY order_date DESC;

This query retrieves all orders placed by a specific customer in descending order of date.

10. Performance and Monitoring Tips

Limit Materialized Views to low-to-medium write workloads.
Use consistent hashing for even data distribution.

Rebuild views if inconsistencies occur using:

nodetool rebuild_materialized_view <keyspace> <view_name>

Periodically check view lag or stale data issues in high-throughput systems.

Conclusion

Materialized Views in Cassandra simplify data modeling by enabling alternate query access paths without complex denormalization. However, they should be used with caution — particularly in high-write or latency-sensitive environments.

By understanding their benefits, limitations, and proper design patterns, you can leverage Materialized Views effectively to optimize your Cassandra-based applications.