Configure TimescaleDB automated data retention policies for efficient storage management

Intermediate · 25 min · Apr 22, 2026
Ubuntu 24.04 Debian 12 AlmaLinux 9 Rocky Linux 9

Set up automated data retention and compression policies in TimescaleDB to optimize storage usage and maintain database performance. Learn to configure drop_chunks and compression policies with monitoring.

Prerequisites

  • PostgreSQL 16+ with TimescaleDB extension
  • Database with appropriate user permissions
  • Basic understanding of SQL and time-series data concepts

What this solves

Time-series workloads generate data continuously. Without retention policies, database storage fills up, query performance degrades, and costs spiral. Automated retention policies built on drop_chunks remove old data permanently, and compression policies shrink the storage footprint while preserving query performance.
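To see why a retention window matters, here is a back-of-the-envelope storage model: without retention, table size grows without bound; with a drop policy it plateaus at roughly (retention window) × (daily ingest). The ingest figures below are illustrative assumptions, not measurements.

```python
def steady_state_gb(daily_ingest_gb, retention_days):
    """Approximate plateau size once the retention policy is active."""
    return daily_ingest_gb * retention_days

def growth_after(daily_ingest_gb, days, retention_days=None):
    """Total table size after `days` of ingest, with or without retention."""
    if retention_days is None or days <= retention_days:
        return daily_ingest_gb * days
    return steady_state_gb(daily_ingest_gb, retention_days)

# 2 GB/day for a year: unbounded vs. capped by a 90-day policy
unbounded = growth_after(2.0, 365)      # 730 GB
capped = growth_after(2.0, 365, 90)     # 180 GB
```

The same arithmetic works in reverse: divide your storage budget by daily ingest to get the longest retention window you can afford.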

Step-by-step configuration

Install TimescaleDB extension

First, verify TimescaleDB is installed and create the extension in your PostgreSQL database.

# Ubuntu / Debian
sudo apt update
sudo apt install -y timescaledb-2-postgresql-16

# AlmaLinux / Rocky Linux
sudo dnf update -y
sudo dnf install -y timescaledb_16

Connect to your PostgreSQL database and create the TimescaleDB extension.

sudo -u postgres psql -d your_database
CREATE EXTENSION IF NOT EXISTS timescaledb;

Create a hypertable for testing

Set up a sample hypertable to demonstrate retention policies. This creates a partitioned table optimized for time-series data.

sudo -u postgres psql -d your_database -c "
CREATE TABLE sensor_data (
    time TIMESTAMPTZ NOT NULL,
    sensor_id INTEGER,
    temperature DOUBLE PRECISION,
    humidity DOUBLE PRECISION
);

SELECT create_hypertable('sensor_data', 'time', chunk_time_interval => INTERVAL '1 day');
"
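Retention and compression both operate on whole chunks, so the chunk_time_interval chosen above determines their granularity. A quick way to sanity-check chunk counts before tuning the interval (pure arithmetic, no database needed):

```python
import math
from datetime import timedelta

def expected_chunks(span, chunk_interval):
    """Upper bound on chunks created for a contiguous time span."""
    return math.ceil(span / chunk_interval)

# Six months of data with 1-day chunks -> ~180 chunks
print(expected_chunks(timedelta(days=180), timedelta(days=1)))
```

A common rule of thumb is to size chunks so the most recent one fits in memory; too many tiny chunks adds planning overhead, too few large ones makes retention coarse.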

Insert sample data across multiple time periods

Generate test data spanning several months to demonstrate retention policy effectiveness.

sudo -u postgres psql -d your_database -c "
INSERT INTO sensor_data
SELECT 
    generate_series(
        NOW() - INTERVAL '6 months',
        NOW(),
        INTERVAL '1 hour'
    ) as time,
    (random() * 100)::INTEGER as sensor_id,
    (random() * 40 + 10) as temperature,
    (random() * 100) as humidity;
"
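For a rough feel of the sample dataset's size: generate_series emits one row per hour over the window, inclusive of both endpoints (month lengths vary, so ~180 days is an approximation here):

```python
from datetime import timedelta

def series_rows(span, step):
    """Rows produced by an inclusive generate_series over `span`."""
    return int(span / step) + 1

rows = series_rows(timedelta(days=180), timedelta(hours=1))            # 4321 rows
per_chunk = series_rows(timedelta(days=1), timedelta(hours=1)) - 1     # 24 rows per 1-day chunk
```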

Configure automated data retention policy

Create a retention policy that automatically drops chunks older than 90 days. This runs daily and removes old data permanently.

sudo -u postgres psql -d your_database -c "
SELECT add_retention_policy('sensor_data', INTERVAL '90 days');
"

For more granular control, specify the job schedule and timezone. A hypertable can have only one retention policy, so remove the existing one before adding the replacement.

sudo -u postgres psql -d your_database -c "
SELECT remove_retention_policy('sensor_data', if_exists => true);
SELECT add_retention_policy(
    'sensor_data', 
    INTERVAL '90 days',
    schedule_interval => INTERVAL '1 day',
    timezone => 'UTC'
);
"
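Note that a chunk is dropped only when its entire time range falls before NOW() - drop_after, so data can outlive the nominal 90-day window by up to one chunk_time_interval. A simplified Python sketch of that eligibility check (an assumption based on the documented drop_chunks semantics, not TimescaleDB's actual code):

```python
from datetime import datetime, timedelta, timezone

def chunk_droppable(range_end, drop_after, now):
    """True only if every row in the chunk is older than the cutoff."""
    return range_end <= now - drop_after

now = datetime(2026, 4, 22, tzinfo=timezone.utc)
old_chunk_end = now - timedelta(days=91)
recent_chunk_end = now - timedelta(days=89)

chunk_droppable(old_chunk_end, timedelta(days=90), now)     # True: whole chunk past cutoff
chunk_droppable(recent_chunk_end, timedelta(days=90), now)  # False: chunk still partly inside window
```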

Set up compression policies for storage optimization

Enable compression on chunks older than 7 days to reduce storage usage while maintaining read performance.

sudo -u postgres psql -d your_database -c "
ALTER TABLE sensor_data SET (
    timescaledb.compress,
    timescaledb.compress_segmentby = 'sensor_id',
    timescaledb.compress_orderby = 'time'
);
"

Add the compression policy to automatically compress old chunks.

sudo -u postgres psql -d your_database -c "
SELECT add_compression_policy('sensor_data', INTERVAL '7 days');
"
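To translate the percentage reductions commonly quoted for TimescaleDB compression into disk numbers, a trivial calculator (the ratios used below are illustrative assumptions):

```python
import math

def compressed_size_gb(raw_gb, reduction):
    """Size after compression, given a fractional reduction in [0, 1)."""
    if not 0 <= reduction < 1:
        raise ValueError("reduction must be in [0, 1)")
    return raw_gb * (1 - reduction)

# 500 GB of historical chunks at a 90% reduction -> ~50 GB on disk
print(compressed_size_gb(500, 0.90))
```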

Configure different retention policies for multiple tables

Set up varying retention periods based on data importance and compliance requirements. These examples assume the critical_metrics, app_logs, and debug_traces hypertables already exist.

sudo -u postgres psql -d your_database -c "
-- Critical metrics: keep for 2 years
SELECT add_retention_policy('critical_metrics', INTERVAL '2 years');

-- Application logs: keep for 30 days
SELECT add_retention_policy('app_logs', INTERVAL '30 days');

-- Debug data: keep for 7 days
SELECT add_retention_policy('debug_traces', INTERVAL '7 days');
"

Create custom retention conditions

Implement conditional retention based on data values, not just time. This keeps important data longer while removing low-value records.

sudo -u postgres psql -d your_database -c "
-- Custom function for conditional retention
CREATE OR REPLACE FUNCTION custom_retention_filter()
RETURNS VOID AS \$\$
BEGIN
    -- Delete error-free logs older than 30 days
    DELETE FROM app_logs 
    WHERE time < NOW() - INTERVAL '30 days' 
    AND log_level != 'ERROR';
    
    -- Keep error logs for 90 days
    DELETE FROM app_logs 
    WHERE time < NOW() - INTERVAL '90 days' 
    AND log_level = 'ERROR';
END;
\$\$ LANGUAGE plpgsql;
"

Schedule custom retention with pg_cron

Install pg_cron extension for advanced scheduling capabilities beyond TimescaleDB's basic policies.

# Ubuntu / Debian
sudo apt install -y postgresql-16-cron

# AlmaLinux / Rocky Linux
sudo dnf install -y pg_cron_16

Add pg_cron to shared_preload_libraries in postgresql.conf.

shared_preload_libraries = 'timescaledb,pg_cron'
cron.database_name = 'your_database'

Restart PostgreSQL and create the extension.

sudo systemctl restart postgresql
sudo -u postgres psql -d your_database -c "CREATE EXTENSION pg_cron;"

Schedule your custom retention function to run daily at 2 AM.

sudo -u postgres psql -d your_database -c "
SELECT cron.schedule('custom-retention', '0 2 * * *', 'SELECT custom_retention_filter();');
"
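pg_cron uses standard five-field cron syntax (minute, hour, day-of-month, month, day-of-week), so '0 2 * * *' fires at 02:00 every day. A tiny checker for daily schedules of that shape (a hypothetical helper for pre-validating expressions, not part of pg_cron itself):

```python
def matches_daily(expr, hour, minute):
    """True if a 'M H * * *' cron expression fires at hour:minute."""
    m, h, dom, mon, dow = expr.split()
    if (dom, mon, dow) != ('*', '*', '*'):
        raise ValueError("only daily 'M H * * *' schedules supported here")
    return int(m) == minute and int(h) == hour

matches_daily('0 2 * * *', 2, 0)   # True: fires at 02:00
matches_daily('0 2 * * *', 3, 0)   # False
```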

Monitor retention policy performance

Create views to track retention policy execution and storage savings.

sudo -u postgres psql -d your_database -c "
-- View retention policy jobs
CREATE VIEW retention_jobs AS
SELECT 
    job_id,
    application_name,
    schedule_interval,
    max_runtime,
    max_retries,
    retry_period,
    proc_schema,
    proc_name,
    owner,
    scheduled,
    fixed_schedule,
    initial_start,
    hypertable_schema,
    hypertable_name,
    config
FROM timescaledb_information.jobs
WHERE proc_name = 'policy_retention';

-- View per-chunk compression statistics
CREATE VIEW compression_stats AS
SELECT 
    chunk_schema,
    chunk_name,
    compression_status,
    pg_size_pretty(before_compression_total_bytes) AS before_size,
    pg_size_pretty(after_compression_total_bytes) AS after_size
FROM chunk_compression_stats('sensor_data');
"

Set up retention policy alerting

Create monitoring functions to alert when retention policies fail or storage usage exceeds thresholds.

sudo -u postgres psql -d your_database -c "
-- Function to check failed retention jobs
CREATE OR REPLACE FUNCTION check_retention_health()
RETURNS TABLE(job_id INT, hypertable_name TEXT, last_run_status TEXT) AS \$\$
BEGIN
    RETURN QUERY
    SELECT 
        j.job_id,
        j.hypertable_name::TEXT,
        jh.last_run_status::TEXT
    FROM timescaledb_information.jobs j
    JOIN timescaledb_information.job_stats jh ON j.job_id = jh.job_id
    WHERE j.proc_name = 'policy_retention'
    AND jh.last_run_status != 'Success'
    AND jh.last_run_started_at > NOW() - INTERVAL '25 hours';
END;
\$\$ LANGUAGE plpgsql;

-- Function to check storage growth
CREATE OR REPLACE FUNCTION check_storage_usage()
RETURNS TABLE(table_name TEXT, size_mb BIGINT, growth_rate NUMERIC) AS \$\$
BEGIN
    RETURN QUERY
    SELECT 
        schemaname || '.' || tablename as table_name,
        pg_total_relation_size(schemaname||'.'||tablename) / (1024*1024) as size_mb,
        0::NUMERIC as growth_rate
    FROM pg_tables
    WHERE schemaname NOT IN ('information_schema', 'pg_catalog', 'timescaledb_information')
    AND pg_total_relation_size(schemaname||'.'||tablename) > 100 * 1024 * 1024; -- > 100MB
END;
\$\$ LANGUAGE plpgsql;
"

Understanding TimescaleDB retention policies

Retention policy types

Policy type    | Purpose                       | Storage impact   | Performance impact
Drop chunks    | Permanently removes old data  | Major reduction  | Improves query speed
Compression    | Compresses historical data    | 70-95% reduction | Slower writes, faster reads
Tiered storage | Moves data to cheaper storage | Cost reduction   | Increased latency

Compression algorithm selection

TimescaleDB uses different compression algorithms based on data types and patterns.

Data type        | Algorithm           | Compression ratio | Best for
TIMESTAMPTZ      | Delta encoding      | 80-90%            | Regular time intervals
INTEGER          | Delta + bit packing | 70-95%            | Sensor IDs, counters
DOUBLE PRECISION | Gorilla compression | 60-80%            | Temperature, metrics
TEXT             | Dictionary encoding | 50-90%            | Status codes, labels
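Delta encoding works so well on regular timestamps because consecutive readings differ by a constant step: after differencing, almost every value is identical and therefore highly compressible. A toy illustration of the idea:

```python
def delta_encode(values):
    """First value verbatim, then successive differences."""
    return values[:1] + [b - a for a, b in zip(values, values[1:])]

def delta_decode(deltas):
    """Reverse of delta_encode: running sum restores the originals."""
    out, acc = [], 0
    for i, d in enumerate(deltas):
        acc = d if i == 0 else acc + d
        out.append(acc)
    return out

# Hourly epoch timestamps: every delta collapses to the same value, 3600
ts = [1700000000 + 3600 * i for i in range(5)]
print(delta_encode(ts))  # [1700000000, 3600, 3600, 3600, 3600]
```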

Verify your setup

Check that your retention and compression policies are working correctly.

# View all configured policies
sudo -u postgres psql -d your_database -c "
SELECT 
    job_id,
    application_name,
    schedule_interval,
    proc_name,
    hypertable_name,
    config
FROM timescaledb_information.jobs
WHERE proc_name IN ('policy_retention', 'policy_compression');"

Check policy execution history

sudo -u postgres psql -d your_database -c "
SELECT 
    job_id,
    last_run_started_at,
    last_run_status,
    last_run_duration,
    next_start
FROM timescaledb_information.job_stats
ORDER BY last_run_started_at DESC;"

View chunk information and compression status

sudo -u postgres psql -d your_database -c "
SELECT 
    chunk_name,
    range_start,
    range_end,
    is_compressed
FROM timescaledb_information.chunks
WHERE hypertable_name = 'sensor_data'
ORDER BY range_start DESC;"

Check storage usage before and after policies

sudo -u postgres psql -d your_database -c "
-- hypertable_size includes all chunks, which live in _timescaledb_internal
SELECT pg_size_pretty(hypertable_size('sensor_data')) AS total_size;"

Test retention policy health

sudo -u postgres psql -d your_database -c "SELECT * FROM check_retention_health();"

Monitor storage usage

sudo -u postgres psql -d your_database -c "SELECT * FROM check_storage_usage();"

Note: Retention policies run asynchronously. Newly created policies may take up to 24 hours for their first execution depending on the schedule interval.

Common issues

Symptom                            | Cause                             | Fix
Retention policy not running       | Background worker not enabled     | Set timescaledb.max_background_workers = 8 in postgresql.conf
Compression failing                | Active writes to old chunks       | Increase compression interval or stop writes to historical data
High storage usage after retention | VACUUM not running                | Run VACUUM FULL or enable autovacuum
Slow queries after compression     | Poor segment/order-by selection   | Adjust compress_segmentby and compress_orderby to match query patterns
Policy job stuck                   | Long-running transaction blocking | Check pg_stat_activity and terminate blocking queries
Retention deleting wrong data      | Incorrect time zone handling      | Verify timezone settings and use UTC consistently

Monitoring and managing retention policies

View retention policy execution logs

sudo -u postgres psql -d your_database -c "
SELECT 
    job_id,
    pid,
    start_time,
    finish_time,
    proc_name,
    err_message
FROM timescaledb_information.job_errors
WHERE finish_time > NOW() - INTERVAL '7 days'
ORDER BY start_time DESC;"

Manual policy execution for testing

Run retention policies manually to test configuration without waiting for the schedule.

# Get job ID for your retention policy
sudo -u postgres psql -d your_database -c "
SELECT job_id, hypertable_name 
FROM timescaledb_information.jobs 
WHERE proc_name = 'policy_retention';"

Execute specific job manually (replace 1000 with actual job_id)

sudo -u postgres psql -d your_database -c "CALL run_job(1000);"

Check execution results

sudo -u postgres psql -d your_database -c " SELECT * FROM timescaledb_information.job_stats WHERE job_id = 1000;"

Modify existing retention policies

# Change the retention period. alter_job replaces the whole config JSON,
# so merge the new value into the existing config rather than overwriting it
sudo -u postgres psql -d your_database -c "
SELECT alter_job(
    job_id,
    config => jsonb_set(config, '{drop_after}', '\"60 days\"')
)
FROM timescaledb_information.jobs
WHERE job_id = 1000;"

Change schedule frequency

sudo -u postgres psql -d your_database -c " SELECT alter_job(1000, schedule_interval => INTERVAL '12 hours');"

Disable a retention policy temporarily

sudo -u postgres psql -d your_database -c " SELECT alter_job(1000, scheduled => false);"

Re-enable the policy

sudo -u postgres psql -d your_database -c " SELECT alter_job(1000, scheduled => true);"

Remove retention policies

Warning: Removing retention policies stops automatic data cleanup. Storage usage will grow continuously without manual intervention.
# Remove retention policy (replace 'sensor_data' with your table)
sudo -u postgres psql -d your_database -c "
SELECT remove_retention_policy('sensor_data');"

Remove compression policy

sudo -u postgres psql -d your_database -c " SELECT remove_compression_policy('sensor_data');"

Next steps

Running this in production?

Want this handled for you? Setting up retention policies once is straightforward. Keeping them tuned, monitored, and optimized as your data patterns change is the harder part. See how we run database infrastructure like this for European SaaS and fintech teams.

