Infrastructure tutorials

Production-grade guides for Linux, servers, security, and performance. Copy-paste commands and multi-distro support, written by engineers who run these systems in production.

devops Intermediate

Implement Consul backup and disaster recovery with automated snapshots and restoration

Set up automated Consul snapshots with GPG encryption, systemd timers, and complete disaster recovery procedures. Includes monitoring integration with Prometheus and automated restoration workflows for production environments.

45 min 4 distros 253 views

devops Advanced

Configure Apache Airflow high availability with CeleryExecutor and Redis clustering for production deployments

Set up Apache Airflow with CeleryExecutor and Redis clustering for high-availability production deployments. Configure multiple workers, load balancing, monitoring, and automated failover so that enterprise-scale workflow orchestration keeps running when a node fails.

45 min 4 distros 302 views

monitoring Intermediate

Set up Prometheus and Grafana monitoring stack with Docker Compose

Deploy a complete monitoring solution using Prometheus for metrics collection and Grafana for visualization with Docker Compose. This setup provides comprehensive system monitoring, alerting capabilities, and customizable dashboards.

25 min 4 distros 332 views

monitoring Intermediate

Set up Docker Compose monitoring stack with Prometheus and Grafana for AI model performance tracking

Deploy a complete monitoring stack using Docker Compose with Prometheus for metrics collection and Grafana for visualization, specifically configured to track AI model performance metrics like inference latency, throughput, and resource utilization.

45 min 4 distros 295 views

monitoring Intermediate

Configure backup monitoring with Prometheus and Grafana for automated infrastructure oversight

Set up comprehensive backup monitoring using Prometheus exporters and Grafana dashboards. Configure automated alerts for backup failures, track success rates, and visualize backup infrastructure health across multiple systems.

45 min 4 distros 275 views

monitoring Intermediate

Monitor Django applications with Prometheus and Grafana for comprehensive performance insights

Set up Django application monitoring using Prometheus metrics collection and Grafana dashboards. Configure the django-prometheus middleware to track request metrics, database queries, and application performance with real-time alerting.

45 min 4 distros 406 views

monitoring Intermediate

Implement Grafana alerting with Prometheus and InfluxDB for comprehensive monitoring

Set up comprehensive Grafana alerting using both Prometheus metrics and InfluxDB time-series data to monitor your infrastructure from multiple data sources. This tutorial covers configuring data sources, creating alert rules, and setting up notification channels for production monitoring.

45 min 4 distros 302 views

databases Advanced

Implement ScyllaDB disaster recovery with cross-region replication

Set up a multi-region ScyllaDB cluster with automated backups, cross-datacenter replication, and failover automation for enterprise-grade disaster recovery and business continuity.

180 min 4 distros 331 views

monitoring Advanced

Monitor MariaDB Galera cluster with Prometheus and Grafana for high availability insights

Configure comprehensive monitoring for MariaDB Galera clusters using Prometheus exporters and Grafana dashboards to track cluster health, replication status, and performance metrics with automated alerting for production environments.

45 min 4 distros 307 views

monitoring Advanced

Set up keepalived cluster monitoring with Prometheus alerts and Grafana dashboards

Configure comprehensive monitoring for keepalived VRRP clusters using Prometheus metrics collection, alerting rules for failover events, and Grafana dashboards for high availability visualization.

45 min 4 distros 291 views

monitoring Advanced

Implement Prometheus federation for multi-cluster monitoring with centralized metrics aggregation

Set up hierarchical Prometheus federation to monitor multiple Kubernetes clusters with a central aggregation layer. Configure global and local Prometheus instances with federated scrape jobs, service discovery, and unified dashboards for enterprise-scale observability.

45 min 4 distros 348 views

monitoring Advanced

Set up Thanos Receiver for remote write scalability with Prometheus integration

Configure Thanos Receiver to handle high-volume remote write traffic from multiple Prometheus instances. This tutorial covers installation, multi-tenancy setup, and performance optimization for large-scale metrics ingestion.

45 min 4 distros 617 views
