Configure ClickHouse materialized views for real-time analytics with performance optimization

Intermediate · 45 min · Apr 21, 2026
Ubuntu 24.04 · Debian 12 · AlmaLinux 9 · Rocky Linux 9

Set up ClickHouse materialized views to transform raw data into real-time aggregations, then tune memory settings and add monitoring for high-throughput analytics workloads.

Prerequisites

  • ClickHouse server installed and running
  • Administrative access to ClickHouse configuration
  • Basic understanding of SQL and time-series data

What this solves

ClickHouse materialized views automatically transform incoming data into pre-aggregated results, enabling real-time analytics on massive datasets. This tutorial shows you how to create materialized views for common analytics patterns like time-series aggregations, user behavior analysis, and performance metrics. You'll also optimize memory usage and set up monitoring to handle high-throughput data ingestion.

Prerequisites and ClickHouse setup verification

Verify ClickHouse installation

Check that ClickHouse server is running and accessible. If you need to install ClickHouse first, follow our ClickHouse installation guide.

sudo systemctl status clickhouse-server
clickhouse-client --query "SELECT version()"

Create database for analytics

Set up a dedicated database for your analytics workload with appropriate settings.

clickhouse-client --query "CREATE DATABASE analytics_db"

Configure memory settings

Optimize ClickHouse memory settings for materialized view processing. These settings improve performance for real-time aggregations. Add them as a profile override, for example in /etc/clickhouse-server/users.d/memory.xml. Note: only the values survived in the original snippet; the setting names below are a reconstruction, so verify each against your ClickHouse version before applying.

<clickhouse>
    <profiles>
        <default>
            <max_memory_usage>10000000000</max_memory_usage>
            <max_bytes_before_external_group_by>8000000000</max_bytes_before_external_group_by>
            <max_bytes_before_external_sort>8000000000</max_bytes_before_external_sort>
            <max_memory_usage_for_user>0</max_memory_usage_for_user>
            <max_memory_usage_for_all_queries>1000000000000</max_memory_usage_for_all_queries>
        </default>
    </profiles>
</clickhouse>

The first three values cap per-query memory at 10 GB and spill GROUP BY and ORDER BY state to disk once it exceeds 8 GB; a value of 0 means unlimited per-user usage.

Restart ClickHouse service

Apply the memory configuration changes by restarting the service.

sudo systemctl restart clickhouse-server
sudo systemctl status clickhouse-server

Create source tables and data ingestion

Create events source table

Set up a source table that receives raw event data. This uses MergeTree engine optimized for time-series data.

clickhouse-client --database=analytics_db --query "
CREATE TABLE events (
    timestamp DateTime64(3),
    user_id UInt32,
    event_type String,
    page_url String,
    session_id String,
    country String,
    device_type String,
    revenue Decimal64(4)
) ENGINE = MergeTree()
PARTITION BY toYYYYMM(timestamp)
ORDER BY (timestamp, user_id)
TTL timestamp + INTERVAL 1 YEAR
SETTINGS index_granularity = 8192
"

Create metrics source table

Set up a table for system metrics and performance data with high ingestion rates.

clickhouse-client --database=analytics_db --query "
CREATE TABLE metrics (
    timestamp DateTime64(3),
    metric_name String,
    metric_value Float64,
    host String,
    service String,
    environment String,
    tags Map(String, String)
) ENGINE = MergeTree()
PARTITION BY toYYYYMM(timestamp)
ORDER BY (timestamp, metric_name, host)
TTL timestamp + INTERVAL 6 MONTH
SETTINGS index_granularity = 8192
"

Insert sample data

Add test data to verify materialized views work correctly with realistic datasets.

clickhouse-client --database=analytics_db --query "
INSERT INTO events VALUES
    ('2024-01-15 10:00:00.000', 1001, 'page_view', '/home', 'sess_001', 'US', 'desktop', 0),
    ('2024-01-15 10:01:30.000', 1001, 'click', '/product/123', 'sess_001', 'US', 'desktop', 0),
    ('2024-01-15 10:02:15.000', 1002, 'page_view', '/home', 'sess_002', 'UK', 'mobile', 0),
    ('2024-01-15 10:03:00.000', 1001, 'purchase', '/checkout', 'sess_001', 'US', 'desktop', 99.99),
    ('2024-01-15 10:04:22.000', 1003, 'page_view', '/about', 'sess_003', 'DE', 'tablet', 0)
"

Insert sample metrics data

Add performance metrics data for system monitoring aggregations.

clickhouse-client --database=analytics_db --query "
INSERT INTO metrics VALUES
    ('2024-01-15 10:00:00.000', 'cpu_usage', 45.2, 'web-01', 'nginx', 'production', {'region': 'us-east-1'}),
    ('2024-01-15 10:01:00.000', 'cpu_usage', 52.8, 'web-01', 'nginx', 'production', {'region': 'us-east-1'}),
    ('2024-01-15 10:01:00.000', 'memory_usage', 78.5, 'web-01', 'nginx', 'production', {'region': 'us-east-1'}),
    ('2024-01-15 10:02:00.000', 'response_time', 145.3, 'web-01', 'nginx', 'production', {'region': 'us-east-1'}),
    ('2024-01-15 10:02:00.000', 'cpu_usage', 38.1, 'db-01', 'clickhouse', 'production', {'region': 'us-east-1'})
"

Configure materialized views for real-time aggregations

Create hourly events aggregation

Set up a materialized view that aggregates events by hour, country, and device type for real-time dashboard queries.

clickhouse-client --database=analytics_db --query "
CREATE TABLE events_hourly (
    hour DateTime,
    country String,
    device_type String,
    total_events UInt64,
    unique_users UInt64,
    page_views UInt64,
    purchases UInt64,
    total_revenue Decimal64(4)
) ENGINE = SummingMergeTree()
PARTITION BY toYYYYMM(hour)
ORDER BY (hour, country, device_type)
TTL hour + INTERVAL 2 YEAR
"

Create materialized view for hourly aggregation

Configure the materialized view that automatically processes incoming events data.

clickhouse-client --database=analytics_db --query "
CREATE MATERIALIZED VIEW events_hourly_mv TO events_hourly AS
SELECT
    toStartOfHour(timestamp) as hour,
    country,
    device_type,
    count() as total_events,
    uniq(user_id) as unique_users,
    countIf(event_type = 'page_view') as page_views,
    countIf(event_type = 'purchase') as purchases,
    sum(revenue) as total_revenue
FROM events
GROUP BY hour, country, device_type
"

Create user session analysis table

Build aggregations for user behavior analysis with session-level metrics.

clickhouse-client --database=analytics_db --query "
CREATE TABLE user_sessions (
    date Date,
    user_id UInt32,
    country String,
    device_type String,
    session_count UInt32,
    total_events UInt32,
    session_duration UInt32,
    total_revenue Decimal64(4)
) ENGINE = ReplacingMergeTree()
PARTITION BY toYYYYMM(date)
ORDER BY (date, user_id)
TTL date + INTERVAL 1 YEAR
"

Create user session materialized view

Process events into user session summaries for behavioral analytics. Be aware that the view aggregates each insert block independently, and ReplacingMergeTree keeps only one arbitrary row per (date, user_id) key when parts merge, so partial rows from separate insert blocks replace one another rather than combine. Re-aggregate at query time (or use an AggregatingMergeTree design) when exact per-user figures matter.

clickhouse-client --database=analytics_db --query "
CREATE MATERIALIZED VIEW user_sessions_mv TO user_sessions AS
SELECT
    toDate(timestamp) as date,
    user_id,
    any(country) as country,
    any(device_type) as device_type,
    uniq(session_id) as session_count,
    count() as total_events,
    toUInt32(dateDiff('second', min(timestamp), max(timestamp))) as session_duration,
    sum(revenue) as total_revenue
FROM events
GROUP BY date, user_id
"

Create metrics aggregation table

Set up real-time metrics aggregation for system monitoring dashboards.

clickhouse-client --database=analytics_db --query "
CREATE TABLE metrics_5min (
    time_bucket DateTime,
    metric_name String,
    host String,
    service String,
    environment String,
    avg_value Float64,
    max_value Float64,
    min_value Float64,
    count_values UInt64
) ENGINE = SummingMergeTree()
PARTITION BY toYYYYMM(time_bucket)
ORDER BY (time_bucket, metric_name, host, service)
TTL time_bucket + INTERVAL 3 MONTH
"

Create metrics materialized view

Aggregate metrics data into 5-minute buckets for real-time monitoring.

clickhouse-client --database=analytics_db --query "
CREATE MATERIALIZED VIEW metrics_5min_mv TO metrics_5min AS
SELECT
    toStartOfFiveMinute(timestamp) as time_bucket,
    metric_name,
    host,
    service,
    environment,
    avg(metric_value) as avg_value,
    max(metric_value) as max_value,
    min(metric_value) as min_value,
    count() as count_values
FROM metrics
GROUP BY time_bucket, metric_name, host, service, environment
"

Optimize materialized view performance and monitoring

Configure async insert settings

Enable asynchronous inserts to improve performance with high-volume data ingestion. Add them to a user profile, for example /etc/clickhouse-server/users.d/async_insert.xml. The tag names were lost in the original; they are reconstructed here from the values, which match ClickHouse's async insert settings (verify against your version):

<clickhouse>
    <profiles>
        <default>
            <async_insert>1</async_insert>
            <wait_for_async_insert>0</wait_for_async_insert>
            <async_insert_busy_timeout_ms>200</async_insert_busy_timeout_ms>
            <async_insert_max_data_size>10485760</async_insert_max_data_size>
            <async_insert_threads>16</async_insert_threads>
            <async_insert_stale_timeout_ms>0</async_insert_stale_timeout_ms>
        </default>
    </profiles>
</clickhouse>

With wait_for_async_insert set to 0, clients get an acknowledgment before the buffer is flushed and lose delivery guarantees; set it to 1 if you need confirmation that data was written.
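The idea behind these settings: instead of creating a new data part for every tiny INSERT, the server buffers rows and flushes the buffer once it reaches async_insert_max_data_size bytes or async_insert_busy_timeout_ms elapses. A toy Python model of that flush policy (a simplification for intuition, not the server's actual implementation):

```python
class AsyncInsertBuffer:
    """Toy model of the async insert flush policy: flush when buffered
    bytes reach max_data_size, or when the busy timeout has elapsed
    since the first buffered write (assumed simplification)."""
    def __init__(self, max_data_size=10_485_760, busy_timeout_ms=200):
        self.max_data_size = max_data_size
        self.busy_timeout_ms = busy_timeout_ms
        self.buffer = bytearray()
        self.first_write = None

    def write(self, payload: bytes, now_ms: float) -> bool:
        """Buffers payload; returns True when this write triggers a flush."""
        if self.first_write is None:
            self.first_write = now_ms
        self.buffer += payload
        if (len(self.buffer) >= self.max_data_size
                or now_ms - self.first_write >= self.busy_timeout_ms):
            self.buffer.clear()
            self.first_write = None
            return True
        return False

buf = AsyncInsertBuffer(max_data_size=1024, busy_timeout_ms=200)
print(buf.write(b"x" * 100, now_ms=0))    # False: under both thresholds
print(buf.write(b"x" * 1000, now_ms=50))  # True: 1100 bytes >= 1024
```

Fewer, larger flushed blocks mean fewer parts for the merge scheduler and fewer partial rows for the materialized views downstream.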

Optimize merge settings for materialized views

Configure merge behavior to handle frequent small inserts from materialized views efficiently. These belong in the server configuration, for example /etc/clickhouse-server/config.d/merge_tree.xml. Only the values survived in the original; the setting names are a best-effort reconstruction (161061273600 bytes is 150 GiB, the default max_bytes_to_merge_at_max_space_in_pool), so check each one against your version:

<clickhouse>
    <merge_tree>
        <parts_to_throw_insert>3000</parts_to_throw_insert>
        <parts_to_delay_insert>1000</parts_to_delay_insert>
        <max_parts_in_total>10000</max_parts_in_total>
        <max_bytes_to_merge_at_max_space_in_pool>161061273600</max_bytes_to_merge_at_max_space_in_pool>
        <merge_with_ttl_timeout>5000</merge_with_ttl_timeout>
        <old_parts_lifetime>1000</old_parts_lifetime>
    </merge_tree>
</clickhouse>

Create monitoring queries

Set up queries to monitor materialized view performance and data freshness.

clickhouse-client --database=analytics_db --query "
CREATE TABLE mv_monitoring (
    timestamp DateTime DEFAULT now(),
    view_name String,
    source_table String,
    target_table String,
    rows_processed UInt64,
    last_update DateTime,
    processing_time_ms UInt64
) ENGINE = MergeTree()
ORDER BY (timestamp, view_name)
TTL timestamp + INTERVAL 30 DAY
"

Create performance monitoring view

Monitor materialized view execution statistics and identify performance bottlenecks.

clickhouse-client --database=analytics_db --query "
CREATE VIEW mv_performance_stats AS
SELECT
    database,
    table as view_name,
    formatReadableSize(total_bytes) as data_size,
    total_rows,
    formatReadableSize(total_bytes_uncompressed) as uncompressed_size,
    round(total_bytes / total_bytes_uncompressed, 3) as compression_ratio,
    partition_key,
    sorting_key
FROM system.tables
WHERE database = 'analytics_db'
AND engine LIKE '%MergeTree'
ORDER BY total_bytes DESC
"

Restart ClickHouse with optimized settings

Apply all performance optimizations by restarting the service.

sudo systemctl restart clickhouse-server
sudo systemctl status clickhouse-server

Test materialized views with bulk data

Insert additional data to verify materialized views process records correctly.

clickhouse-client --database=analytics_db --query "
INSERT INTO events SELECT
    now() - INTERVAL number MINUTE as timestamp,
    1000 + (number % 100) as user_id,
    ['page_view', 'click', 'purchase'][1 + (number % 3)] as event_type,
    '/page/' || toString(number % 20) as page_url,
    'session_' || toString(number % 50) as session_id,
    ['US', 'UK', 'DE', 'FR', 'JP'][1 + (number % 5)] as country,
    ['desktop', 'mobile', 'tablet'][1 + (number % 3)] as device_type,
    if(number % 10 = 0, 50.0 + (number % 100), 0) as revenue
FROM numbers(10000)
"

Verify your setup

Check that materialized views are processing data correctly and aggregations are updating in real-time.

# Check materialized view status
clickhouse-client --database=analytics_db --query "SHOW TABLES LIKE '%_mv'"

Verify hourly aggregations

clickhouse-client --database=analytics_db --query "SELECT hour, country, device_type, sum(total_events) AS total_events, sum(unique_users) AS unique_users, sum(total_revenue) AS total_revenue FROM events_hourly GROUP BY hour, country, device_type ORDER BY hour DESC, total_events DESC LIMIT 10"

Because SummingMergeTree collapses duplicate keys only during background merges, re-aggregate with sum() and GROUP BY when querying; treat the summed unique_users as an approximation.

Check user session data

clickhouse-client --database=analytics_db --query "SELECT date, count() as active_users, sum(total_revenue) as daily_revenue FROM user_sessions FINAL GROUP BY date ORDER BY date DESC LIMIT 5"

The FINAL modifier forces ReplacingMergeTree deduplication at query time so unmerged duplicate rows are not double counted.

Monitor metrics aggregation

clickhouse-client --database=analytics_db --query "SELECT time_bucket, metric_name, host, avg_value, max_value FROM metrics_5min ORDER BY time_bucket DESC LIMIT 10"

Check materialized view performance

clickhouse-client --database=analytics_db --query "SELECT * FROM mv_performance_stats"

Pro tip: Use SYSTEM FLUSH LOGS to ensure all query statistics are written to system tables for monitoring.

Common issues

Symptom, cause, and fix for each:

  • Materialized view not updating. Cause: view creation failed or a syntax error. Fix: run SHOW CREATE TABLE view_name and check system.query_log.
  • High memory usage during aggregation. Cause: insufficient memory for GROUP BY operations. Fix: increase max_memory_usage and max_bytes_before_external_group_by.
  • Slow materialized view processing. Cause: too many small parts or an inefficient ORDER BY. Fix: optimize merge settings and review the table structure.
  • Data duplication in SummingMergeTree. Cause: non-optimal ORDER BY clause. Fix: ensure ORDER BY includes all dimension columns.
  • Materialized view consuming too much disk. Cause: missing TTL or compression settings. Fix: add TTL settings and enable compression.

Next steps

Running this in production?

Want this handled for you? Setting up ClickHouse materialized views is straightforward. Keeping them optimized, monitored, and scaling efficiently across environments is the harder part. See how we run infrastructure like this for European teams processing billions of events daily.


Need help?

Don't want to manage this yourself?

We handle high availability infrastructure for businesses that depend on uptime. From initial setup to ongoing operations.