Set up Apache Airflow high availability with CeleryExecutor and Redis clustering

Advanced · 45 min · Apr 09, 2026
Ubuntu 24.04 Debian 12 AlmaLinux 9 Rocky Linux 9

Deploy Apache Airflow in high availability mode using CeleryExecutor with Redis clustering for task distribution, PostgreSQL connection pooling, and load-balanced webservers for production-grade workflow orchestration.

Prerequisites

  • Root or sudo access
  • At least 8GB RAM
  • PostgreSQL 12+ running
  • Basic knowledge of Apache Airflow

What this solves

Apache Airflow single-node deployments create bottlenecks and single points of failure for critical data pipelines. CeleryExecutor with Redis clustering distributes task execution across multiple worker nodes, providing horizontal scaling and fault tolerance. This setup handles thousands of concurrent tasks while maintaining system resilience through PostgreSQL connection pooling, a sharded Redis cluster, and load-balanced webservers.

Step-by-step installation

Update system packages and install dependencies

Start by updating your package manager and installing essential build dependencies for Apache Airflow and Redis compilation.

# Ubuntu 24.04 / Debian 12
sudo apt update && sudo apt upgrade -y
sudo apt install -y python3-pip python3-dev python3-venv build-essential libssl-dev libffi-dev libpq-dev redis-server postgresql-client git curl wget

# AlmaLinux 9 / Rocky Linux 9
sudo dnf update -y
sudo dnf install -y python3-pip python3-devel gcc gcc-c++ openssl-devel libffi-devel postgresql-devel redis git curl wget make

Configure Redis cluster for Celery broker

A Redis cluster shards the Celery task queues across nodes and, with replicas, provides automatic failover. Configure three Redis nodes with clustering enabled, saving the first node's config as /etc/redis/redis-6379.conf (the path assumed by the template unit used below; adjust for your distribution's layout).

port 6379
cluster-enabled yes
cluster-config-file nodes-6379.conf
cluster-node-timeout 15000
appendonly yes
bind 0.0.0.0
protected-mode no
maxmemory 2gb
maxmemory-policy allkeys-lru
save 900 1
save 300 10
save 60 10000
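
These nodes bind 0.0.0.0 with protected-mode off, so restrict access to trusted hosts only. If the cluster nodes run on separate machines, both the client ports and the cluster bus ports (client port + 10000) must be reachable between them. A hedged example, assuming firewalld on AlmaLinux/Rocky and ufw on Ubuntu/Debian:

# AlmaLinux 9 / Rocky Linux 9 (firewalld)
sudo firewall-cmd --permanent --add-port=6379-6381/tcp
sudo firewall-cmd --permanent --add-port=16379-16381/tcp
sudo firewall-cmd --reload

# Ubuntu 24.04 / Debian 12 (ufw)
sudo ufw allow 6379:6381/tcp
sudo ufw allow 16379:16381/tcp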

Create additional Redis cluster nodes

Configure two additional Redis instances on different ports to form a complete cluster with automatic sharding, saving them as /etc/redis/redis-6380.conf and /etc/redis/redis-6381.conf.

# /etc/redis/redis-6380.conf
port 6380
cluster-enabled yes
cluster-config-file nodes-6380.conf
cluster-node-timeout 15000
appendonly yes
bind 0.0.0.0
protected-mode no
maxmemory 2gb
maxmemory-policy allkeys-lru
save 900 1
save 300 10
save 60 10000

# /etc/redis/redis-6381.conf
port 6381
cluster-enabled yes
cluster-config-file nodes-6381.conf
cluster-node-timeout 15000
appendonly yes
bind 0.0.0.0
protected-mode no
maxmemory 2gb
maxmemory-policy allkeys-lru
save 900 1
save 300 10
save 60 10000

Start Redis cluster nodes

Launch all three Redis instances and initialize the cluster with automatic slot assignment.

sudo systemctl start redis@6379
sudo systemctl start redis@6380
sudo systemctl start redis@6381
sudo systemctl enable redis@6379 redis@6380 redis@6381
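
The redis@<port> template unit used above is not shipped by every distribution. On Debian/Ubuntu, also stop the stock single-instance service first so it does not hold port 6379 (sudo systemctl disable --now redis-server). If systemctl reports the template unit as missing, here is a minimal sketch you could create as /etc/systemd/system/redis@.service, assuming the per-port config paths above and the packaged redis user:

[Unit]
Description=Redis instance on port %i
After=network.target

[Service]
User=redis
Group=redis
ExecStart=/usr/bin/redis-server /etc/redis/redis-%i.conf
Restart=always

[Install]
WantedBy=multi-user.target

Run sudo systemctl daemon-reload after creating it, then re-run the start and enable commands.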

Initialize cluster

redis-cli --cluster create 127.0.0.1:6379 127.0.0.1:6380 127.0.0.1:6381 --cluster-replicas 0 --cluster-yes
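
Note that --cluster-replicas 0 gives sharding but no automatic failover: if a master dies, its hash slots go unserved until it returns. For failover, run six nodes and assign one replica per master; a sketch assuming three additional instances on ports 6382-6384:

redis-cli --cluster create 127.0.0.1:6379 127.0.0.1:6380 127.0.0.1:6381 \
  127.0.0.1:6382 127.0.0.1:6383 127.0.0.1:6384 \
  --cluster-replicas 1 --cluster-yes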

Create Airflow system user

Create a dedicated system user for running Airflow services with proper isolation and security.

sudo useradd -r -m -s /bin/bash airflow
sudo mkdir -p /opt/airflow
sudo chown airflow:airflow /opt/airflow
sudo mkdir -p /var/log/airflow
sudo chown airflow:airflow /var/log/airflow

Install Apache Airflow with CeleryExecutor

Install Airflow with all required dependencies for CeleryExecutor, PostgreSQL, and Redis integration.

sudo -u airflow bash
cd /opt/airflow
python3 -m venv venv
source venv/bin/activate
export AIRFLOW_HOME=/opt/airflow
export PYTHONPATH=/opt/airflow
pip install --upgrade pip
pip install 'apache-airflow[celery,postgres,redis]==2.8.0' \
  --constraint "https://raw.githubusercontent.com/apache/airflow/constraints-2.8.0/constraints-3.11.txt"
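
The constraints file must match the interpreter's minor version, and Airflow 2.8.0 supports Python 3.8 through 3.11 only. A variant that derives the version automatically (same pinned Airflow release assumed):

PYVER=$(python3 -c 'import sys; print(f"{sys.version_info.major}.{sys.version_info.minor}")')
pip install 'apache-airflow[celery,postgres,redis]==2.8.0' \
  --constraint "https://raw.githubusercontent.com/apache/airflow/constraints-2.8.0/constraints-${PYVER}.txt"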

Configure Airflow with CeleryExecutor

Set up /opt/airflow/airflow.cfg with CeleryExecutor, the Redis cluster broker, and the PostgreSQL backend for high availability.

[core]
executor = CeleryExecutor
load_examples = False
dags_folder = /opt/airflow/dags
# Generate a key and paste it here (see the commands after this file);
# airflow.cfg does not expand shell command substitutions.
fernet_key = <paste-generated-fernet-key>
max_active_runs_per_dag = 16
max_active_tasks_per_dag = 16

[database]
# Since Airflow 2.3 the connection string lives under [database], not [core]
sql_alchemy_conn = postgresql://airflow:airflow_password@localhost:5432/airflow

[logging]
# Logging options moved from [core] to [logging] in Airflow 2.x
base_log_folder = /var/log/airflow
logging_level = INFO

[webserver]
base_url = http://localhost:8080
web_server_port = 8080
workers = 4
worker_class = sync
worker_refresh_batch_size = 1
worker_refresh_interval = 30
reload_on_plugin_change = True

[celery]
# NOTE: stock Celery/Kombu does not ship a Redis Cluster broker transport, so
# the redis+cluster:// scheme below assumes a cluster-aware transport is
# installed in the environment. The HA broker Celery supports natively is
# Redis Sentinel, e.g.:
#   broker_url = sentinel://host1:26379;sentinel://host2:26379;sentinel://host3:26379
# with master_name set under [celery_broker_transport_options].
broker_url = redis+cluster://localhost:6379,localhost:6380,localhost:6381/0
result_backend = db+postgresql://airflow:airflow_password@localhost:5432/airflow
worker_concurrency = 16
worker_log_server_port = 8793
worker_prefetch_multiplier = 1
task_always_eager = False
task_acks_late = True
task_reject_on_worker_lost = True
worker_enable_remote_control = True

[celery_broker_transport_options]
visibility_timeout = 21600
fanout_prefix = True
fanout_patterns = True

[scheduler]
# 0 re-parses DAG files continuously; raise this on CPU-constrained schedulers
min_file_process_interval = 0
dag_dir_list_interval = 300
scheduler_heartbeat_sec = 5
# max_threads was renamed parsing_processes in Airflow 2.x
parsing_processes = 2
catchup_by_default = False
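
The fernet_key placeholder must be replaced with a real key before any component starts. With the virtualenv active, generate one and patch it in (the sed pattern assumes the placeholder text used above):

FERNET_KEY=$(python -c "from cryptography.fernet import Fernet; print(Fernet.generate_key().decode())")
sed -i "s|<paste-generated-fernet-key>|${FERNET_KEY}|" /opt/airflow/airflow.cfg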

Configure PostgreSQL connection pooling

Set up PgBouncer for connection pooling to handle multiple Airflow components efficiently and prevent connection exhaustion.

# Ubuntu 24.04 / Debian 12
sudo apt install -y pgbouncer
# AlmaLinux 9 / Rocky Linux 9
sudo dnf install -y pgbouncer

Edit /etc/pgbouncer/pgbouncer.ini:

[databases]
airflow = host=localhost port=5432 dbname=airflow

[pgbouncer]
listen_addr = *
listen_port = 6432
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt
pool_mode = transaction
max_client_conn = 200
default_pool_size = 20
max_db_connections = 100
server_reset_query = DISCARD ALL
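
The auth_file referenced above must exist and hold the airflow credentials, or every connection through port 6432 will fail authentication. For auth_type = md5, PgBouncer stores md5(password + username) prefixed with "md5"; a sketch using the password from this guide:

HASH=$(echo -n "airflow_passwordairflow" | md5sum | awk '{print $1}')
echo "\"airflow\" \"md5${HASH}\"" | sudo tee /etc/pgbouncer/userlist.txt

Note that PostgreSQL 14+ stores passwords as scram-sha-256 by default; if authentication still fails, either set password_encryption = md5 before creating the role or switch PgBouncer to auth_type = scram-sha-256 with the plain password in userlist.txt.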

Initialize Airflow database

Create the PostgreSQL database and initialize the Airflow schema with an admin user. Creating the role first and making it the database owner sidesteps the PostgreSQL 15+ restriction on writing to the public schema, which a bare GRANT on the database does not cover.

sudo -u postgres psql -c "CREATE USER airflow WITH PASSWORD 'airflow_password';"
sudo -u postgres createdb -O airflow airflow

sudo -u airflow bash
cd /opt/airflow
source venv/bin/activate
export AIRFLOW_HOME=/opt/airflow
airflow db init
airflow users create --username admin --firstname Admin --lastname User --role Admin --email admin@example.com --password admin_password
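
Still inside the airflow user's shell, verify connectivity before wiring up services (airflow db check reports whether the metadata database is reachable):

airflow db check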

Create systemd services for Airflow components

Configure systemd units for the Airflow scheduler, webserver, and Celery workers with proper dependency management, creating each one as its own file under /etc/systemd/system/.

# /etc/systemd/system/airflow-scheduler.service
[Unit]
Description=Airflow Scheduler
After=network.target postgresql.service redis.service
Wants=postgresql.service redis.service

[Service]
EnvironmentFile=-/opt/airflow/airflow.env
User=airflow
Group=airflow
# Airflow processes do not implement sd_notify, so use Type=simple
Type=simple
RuntimeDirectory=airflow
WorkingDirectory=/opt/airflow
ExecStart=/opt/airflow/venv/bin/airflow scheduler
ExecReload=/bin/kill -s HUP $MAINPID
KillMode=mixed
TimeoutStopSec=300
Restart=always

[Install]
WantedBy=multi-user.target

# /etc/systemd/system/airflow-webserver.service
[Unit]
Description=Airflow Webserver
After=network.target postgresql.service redis.service
Wants=postgresql.service redis.service

[Service]
EnvironmentFile=-/opt/airflow/airflow.env
User=airflow
Group=airflow
Type=simple
RuntimeDirectory=airflow
WorkingDirectory=/opt/airflow
ExecStart=/opt/airflow/venv/bin/airflow webserver
ExecReload=/bin/kill -s HUP $MAINPID
KillMode=mixed
TimeoutStopSec=300
Restart=always

[Install]
WantedBy=multi-user.target

# /etc/systemd/system/airflow-worker.service
[Unit]
Description=Airflow Celery Worker
After=network.target postgresql.service redis.service
Wants=postgresql.service redis.service

[Service]
EnvironmentFile=-/opt/airflow/airflow.env
User=airflow
Group=airflow
Type=simple
RuntimeDirectory=airflow
WorkingDirectory=/opt/airflow
ExecStart=/opt/airflow/venv/bin/airflow celery worker
ExecReload=/bin/kill -s HUP $MAINPID
KillMode=mixed
TimeoutStopSec=300
Restart=always

[Install]
WantedBy=multi-user.target

Configure environment variables

Create /opt/airflow/airflow.env with the Airflow configuration and Python path shared by all services.

AIRFLOW_HOME=/opt/airflow
PYTHONPATH=/opt/airflow
AIRFLOW__CORE__EXECUTOR=CeleryExecutor
# Port 6432 routes through PgBouncer rather than straight to PostgreSQL;
# since Airflow 2.3 the variable is AIRFLOW__DATABASE__..., not AIRFLOW__CORE__...
AIRFLOW__DATABASE__SQL_ALCHEMY_CONN=postgresql://airflow:airflow_password@localhost:6432/airflow
AIRFLOW__CELERY__BROKER_URL=redis+cluster://localhost:6379,localhost:6380,localhost:6381/0
AIRFLOW__CELERY__RESULT_BACKEND=db+postgresql://airflow:airflow_password@localhost:6432/airflow

Restrict access to the file, since it embeds credentials:

sudo chown airflow:airflow /opt/airflow/airflow.env
sudo chmod 640 /opt/airflow/airflow.env
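
To confirm the services will see the intended settings, resolve a value through the same environment file (a hedged check; airflow config get-value prints the effective configuration):

sudo -u airflow bash -c 'set -a; source /opt/airflow/airflow.env; set +a; cd /opt/airflow && ./venv/bin/airflow config get-value celery broker_url'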

Configure HAProxy load balancer

Set up HAProxy for load balancing multiple Airflow webserver instances with health checks and failover.

# Ubuntu 24.04 / Debian 12
sudo apt install -y haproxy
# AlmaLinux 9 / Rocky Linux 9
sudo dnf install -y haproxy

Edit /etc/haproxy/haproxy.cfg:

global
    daemon
    user haproxy
    group haproxy
    log 127.0.0.1:514 local0
    chroot /var/lib/haproxy
    stats socket /run/haproxy/admin.sock mode 660 level admin
    stats timeout 30s

defaults
    mode http
    log global
    option httplog
    option dontlognull
    timeout connect 5000
    timeout client 50000
    timeout server 50000
    errorfile 400 /etc/haproxy/errors/400.http
    errorfile 403 /etc/haproxy/errors/403.http
    errorfile 408 /etc/haproxy/errors/408.http
    errorfile 500 /etc/haproxy/errors/500.http
    errorfile 502 /etc/haproxy/errors/502.http
    errorfile 503 /etc/haproxy/errors/503.http
    errorfile 504 /etc/haproxy/errors/504.http

frontend airflow_frontend
    bind *:80
    default_backend airflow_webservers

backend airflow_webservers
    balance roundrobin
    option httpchk GET /health
    http-check expect status 200
    server airflow1 127.0.0.1:8080 check
    # airflow2 assumes a second webserver instance (another host, or a second
    # local instance on port 8081); remove this line until one exists
    server airflow2 127.0.0.1:8081 check backup

listen stats
    bind *:8404
    stats enable
    stats uri /stats
    stats refresh 30s
    stats admin if TRUE
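
Validate the file before starting HAProxy; the -c flag parses the configuration without binding any sockets:

sudo haproxy -c -f /etc/haproxy/haproxy.cfg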

Start and enable all services

Enable and start all Airflow services, Redis cluster, PgBouncer, and HAProxy with proper startup order.

sudo systemctl daemon-reload
sudo systemctl enable --now pgbouncer
sudo systemctl enable --now haproxy
sudo systemctl enable --now airflow-scheduler
sudo systemctl enable --now airflow-webserver
sudo systemctl enable --now airflow-worker

Configure Celery worker auto-scaling

Install Celery Flower for queue and worker monitoring; worker auto-scaling itself is driven by Airflow's [celery] settings, shown after the service unit below.

sudo -u airflow bash
cd /opt/airflow
source venv/bin/activate
pip install flower
exit

Create /etc/systemd/system/airflow-flower.service:

[Unit]
Description=Airflow Celery Flower
After=network.target postgresql.service redis.service
Wants=postgresql.service redis.service

[Service]
EnvironmentFile=-/opt/airflow/airflow.env
User=airflow
Group=airflow
Type=simple
WorkingDirectory=/opt/airflow
ExecStart=/opt/airflow/venv/bin/airflow celery flower --port=5555
Restart=always

[Install]
WantedBy=multi-user.target

sudo systemctl daemon-reload
sudo systemctl enable --now airflow-flower
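
For the auto-scaling itself, Airflow exposes Celery's autoscaler through the [celery] worker_autoscale option (max,min processes per worker); when set, it takes precedence over worker_concurrency. A sketch appended to /opt/airflow/airflow.env:

AIRFLOW__CELERY__WORKER_AUTOSCALE=16,4

Restart the workers afterwards: sudo systemctl restart airflow-worker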

Verify your setup

Check all service statuses and verify Airflow is running correctly with CeleryExecutor and Redis clustering.

sudo systemctl status redis@6379 redis@6380 redis@6381
sudo systemctl status pgbouncer haproxy
sudo systemctl status airflow-scheduler airflow-webserver airflow-worker airflow-flower

Check Redis cluster status

redis-cli -p 6379 cluster nodes
redis-cli -p 6379 cluster info

Test Airflow web interface

curl -s http://localhost/health
curl -s http://localhost:5555  # Flower monitoring

Check Celery workers

sudo -u airflow bash -c "cd /opt/airflow && source venv/bin/activate && airflow celery inspect active"

Common issues

Tasks stuck in queued state
  Cause: Celery workers not connected to Redis
  Fix: check Redis cluster status and worker logs: journalctl -u airflow-worker

Database connection errors
  Cause: PgBouncer pool exhaustion
  Fix: increase pool size in pgbouncer.ini and restart: sudo systemctl restart pgbouncer

Redis cluster initialization fails
  Cause: port conflicts or firewall
  Fix: check port availability (netstat -tlnp | grep 637) and open the cluster ports

HAProxy backend servers down
  Cause: webserver health check failing
  Fix: check webserver status and logs: journalctl -u airflow-webserver

Permission denied errors
  Cause: incorrect file ownership
  Fix: sudo chown -R airflow:airflow /opt/airflow /var/log/airflow
