Infrastructure tutorials

Production-grade guides for Linux, servers, security and performance. Copy-paste commands, multi-distro support, written by engineers who run this in production.

devops Advanced

Configure Spark on Kubernetes with cluster autoscaling for dynamic workloads

Deploy Apache Spark 3.5 on Kubernetes with automatic cluster scaling, dynamic resource allocation, and comprehensive monitoring for production data processing workloads.

45 min 4 distros 185 views
devops Advanced

Set up Spark Streaming with Kafka and Delta Lake for real-time analytics

Configure Apache Spark 3.5 with Kafka integration and Delta Lake support for building production-grade real-time analytics pipelines with ACID transactions and streaming capabilities.

45 min 4 distros 156 views
performance Advanced

Implement Spark SQL performance optimization with Catalyst optimizer and advanced tuning

Optimize Apache Spark 3.5 SQL performance using the Catalyst optimizer, with advanced query tuning, adaptive query execution, and production-grade configuration for high-throughput analytics workloads.

45 min 4 distros 95 views
devops Advanced

Implement Kafka Streams exactly-once processing semantics with Java applications

Configure a Kafka cluster and Java applications for exactly-once processing semantics with transaction state management, idempotent producers, and EOS isolation levels for reliable stream processing.

45 min 4 distros 116 views
devops Intermediate

Configure Kafka Schema Registry with Avro serialization for data processing

Set up Confluent Schema Registry with Avro serialization to manage schemas and ensure data compatibility in your Kafka streaming applications. This guide covers installation, schema management, and producer/consumer configuration.

45 min 4 distros 256 views
devops Advanced

Configure Kafka Streams state stores and RocksDB optimization for high-performance streaming applications

Configure Kafka Streams state stores with RocksDB optimization for high-performance streaming applications. Learn custom state store configurations, RocksDB tuning parameters, and monitoring techniques for production-grade stream processing.

45 min 4 distros 173 views
devops Intermediate

Set up Kafka Streams testing framework with TopologyTestDriver for automated stream processing validation

Configure a complete testing framework for Kafka Streams applications using TopologyTestDriver to validate stream processing logic with automated tests and mock data pipelines.

45 min 4 distros 137 views
devops Advanced

Set up Kafka Connect cluster with high availability and load balancing

Configure a production-ready Kafka Connect cluster with multiple worker nodes, HAProxy load balancing, and Prometheus monitoring. Includes distributed configuration, shared storage setup, and comprehensive health checks for reliable data pipeline processing.

45 min 4 distros 235 views
devops Intermediate

Configure Kafka Streams for real-time data processing and analytics

Set up Kafka Streams applications with a Java development environment to build real-time data processing pipelines for analytics and monitoring workloads.

45 min 4 distros 186 views
devops Advanced

Configure Spark Kubernetes Operator with MinIO for cloud-native analytics

Deploy Apache Spark on Kubernetes with the Spark Operator and MinIO object storage for scalable big data processing. Configure RBAC, SSL certificates, and persistent storage for production-ready analytics workloads.

45 min 4 distros 207 views
performance Advanced

Optimize ClickHouse performance for high-throughput workloads with advanced tuning and memory management

Learn how to optimize ClickHouse for high-throughput analytics workloads through advanced memory configuration, query performance tuning, storage engine optimization, and connection pooling strategies.

45 min 4 distros 219 views
databases Intermediate

Configure ClickHouse materialized views for real-time analytics with performance optimization

Set up ClickHouse materialized views to transform raw data into real-time aggregations. Configure performance optimization with memory tuning and monitoring for high-throughput analytics workloads.

45 min 4 distros 274 views
